Recognize and Parse Dates in Various Formats, Including All ISO 8601 Formats

Parse dates automatically, without needing to specify a format. The package currently includes the git date parser, and it can also recognize and parse all ISO 8601 formats.



This R package has three functions for dealing with dates.

  • parse_iso_8601 recognizes and parses all valid ISO 8601 date and time formats. It can also be used as an ISO 8601 validator.
  • parse_date can parse a date when you don't know which format it is in. First it tries all ISO 8601 formats. Then it tries git's versatile date parser. Lastly, it tries as.POSIXct.
  • format_iso_8601 formats a date (and time) in a specific ISO 8601 format.

Limitations

The git parser does not work for dates before 1970 or after 2100. For such dates the current year is used instead:

library(parsedate)
parse_date("april 15 1971")
parse_date("april 15 1969")
parse_date("april 15 2110")
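
Because the year is replaced silently, it can help to flag suspicious input before parsing. The following is a rough sketch in base R, not part of parsedate: year_outside_git_range is a hypothetical helper, and it simply assumes that the first run of four digits in a string is the year, which will not hold for every format.

```r
# Hypothetical helper (not part of parsedate): TRUE where the first
# four-digit run in the string looks like a year outside 1970..2100.
year_outside_git_range <- function(x, lower = 1970L, upper = 2100L) {
  pos <- as.integer(regexpr("[0-9]{4}", x))  # -1 where nothing matches
  year <- rep(NA_integer_, length(x))
  hit <- pos > 0
  year[hit] <- as.integer(substr(x[hit], pos[hit], pos[hit] + 3L))
  # Strings without a four-digit run are not flagged
  !is.na(year) & (year < lower | year > upper)
}

year_outside_git_range(c("april 15 1971", "april 15 1969", "april 15 2110"))
## [1] FALSE  TRUE  TRUE
```

Flagged strings can then be checked by hand or parsed with an explicit format instead.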

Parsing ISO 8601 dates

parse_iso_8601 recognizes all valid ISO 8601 formats, and gives an NA for invalid dates. Here are some examples:

Dates with missing fields

library(parsedate)
parse_iso_8601("2013-02-08 09")
## [1] "2013-02-08 09:00:00 UTC"
parse_iso_8601("2013-02-08 09:30")
## [1] "2013-02-08 09:30:00 UTC"

Separator between date and time

parse_iso_8601("2013-02-08T09")
## [1] "2013-02-08 09:00:00 UTC"
parse_iso_8601("2013-02-08T09:30")
## [1] "2013-02-08 09:30:00 UTC"
parse_iso_8601("2013-02-08T09:30:26")
## [1] "2013-02-08 09:30:26 UTC"

Fractional seconds, minutes, hours

parse_iso_8601("2013-02-08T09:30:26.123")
## [1] "2013-02-08 09:30:26 UTC"
parse_iso_8601("2013-02-08T09:30.5")
## [1] "2013-02-08 09:30:30 UTC"
parse_iso_8601("2013-02-08T09,25")
## [1] "2013-02-08 09:15:00 UTC"

Zulu time zone is UTC

parse_iso_8601("2013-02-08T09:30:26Z")
## [1] "2013-02-08 09:30:26 UTC"

ISO weeks are parsed properly

parse_iso_8601("2013-W06-5")
## [1] "2013-02-08 UTC"
parse_iso_8601("2013-W01-1")
## [1] "2012-12-31 UTC"
parse_iso_8601("2009-W01-1")
## [1] "2008-12-29 UTC"
parse_iso_8601("2009-W53-7")
## [1] "2010-01-03 UTC"

Day of the year

parse_iso_8601("2013-039")
## [1] "2013-02-08 UTC"
parse_iso_8601("2013-039 09:30:26Z")
## [1] "2013-02-08 09:30:26 UTC"
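
Using parse_iso_8601 as a validator

Since invalid input yields NA, the parser can double as a validator. A minimal sketch, assuming the parsedate package is installed; is_iso_8601 is a hypothetical helper name, not a function exported by the package:

```r
library(parsedate)

# Hypothetical helper: TRUE where the string parses as ISO 8601,
# FALSE where parse_iso_8601 returns NA.
is_iso_8601 <- function(x) !is.na(parse_iso_8601(x))

is_iso_8601(c("2013-02-08T09:30:26Z", "2013-W06-5", "not a date"))
```

Note that a genuinely missing input (NA) is also reported as FALSE by this helper.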

Guess the format of the date, and parse it

Sometimes one has to work with a large number of dates in arbitrary formats. It is of course impossible to reliably guess the format of some dates, because of ambiguity. But it is often not critical to get the date exactly right in the ambiguous cases, and this is when the parse_date function is useful. It tries a large number of formats; here is the algorithm it uses:

  1. Try parsing dates using all valid ISO 8601 formats, by calling parse_iso_8601.
  2. If this fails, then try parsing them using the git date parser.
  3. If this fails, then try parsing them using as.POSIXct. (It is unlikely that this step will parse any dates that the first two steps couldn't, but it is still a logical fallback, to make sure that we can parse at least as many dates as as.POSIXct can.)
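
The general shape of such a fallback chain can be sketched in base R. This is an illustration of the idea only, not the package's implementation: parse_with_fallbacks is a hypothetical name, and the two stand-in parsers use fixed as.POSIXct formats rather than the ISO 8601 and git parsers that parse_date actually calls.

```r
# Try each parser in turn, but only on the entries that are still NA.
parse_with_fallbacks <- function(x, parsers) {
  out <- as.POSIXct(rep(NA_real_, length(x)), origin = "1970-01-01", tz = "UTC")
  for (parser in parsers) {
    todo <- is.na(out)
    if (!any(todo)) break
    out[todo] <- parser(x[todo])
  }
  out
}

# Stand-in parsers, chosen for illustration only
parsers <- list(
  function(x) as.POSIXct(x, format = "%Y-%m-%d", tz = "UTC"),
  function(x) as.POSIXct(x, format = "%m/%d/%y", tz = "UTC")
)

res <- parse_with_fallbacks(c("2014-12-12", "04/15/99", "no date here"), parsers)
format(res, "%Y-%m-%d", tz = "UTC")
## [1] "2014-12-12" "1999-04-15" NA
```

Each later parser only sees the strings that every earlier parser rejected, so the order of the list decides how ambiguous input is read.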

Here are some examples. The first ones are easy.

parse_date("2014-12-12")
## [1] "2014-12-12 UTC"
parse_date("04/15/99")
## [1] "1999-04-15 UTC"
parse_date("15/04/99")
## [1] "1999-04-15 UTC"

Ambiguous formats

The following formats are ambiguous and are parsed as month/day/year.

parse_date("12/11/99")
## [1] "1999-12-11 UTC"
parse_date("11/12/99")
## [1] "1999-11-12 UTC"
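
When the convention of a data source is known, the ambiguity can be avoided by stating the format explicitly instead of guessing. A small sketch in base R; as.Date with an explicit format is a different technique from what parse_date does, shown here only for contrast:

```r
# The same string read under two different conventions
x <- "12/11/99"

month_first <- as.Date(x, format = "%m/%d/%y")  # month/day/year
day_first   <- as.Date(x, format = "%d/%m/%y")  # day/month/year

format(month_first)
## [1] "1999-12-11"
format(day_first)
## [1] "1999-11-12"
```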

Fill in the current date and time for missing fields

parse_date("03/20")
## [1] "2019-03-20 UTC"
parse_date("12")
## [1] "2019-05-12 UTC"

A bare four-digit year is not filled in this way, because it is already a valid ISO 8601 date:

parse_date("2014")
## [1] "2014-01-01 UTC"

Formatting dates as ISO 8601

The format_iso_8601 function formats a date (and time) in a fixed format that is ISO 8601 valid, and can be used to compare dates as character strings. It converts the date(s) to UTC.

format_iso_8601(parse_iso_8601("2013-02-08"))
## [1] "2013-02-08T00:00:00+00:00"
format_iso_8601(parse_iso_8601("2013-02-08 09:34:00"))
## [1] "2013-02-08T09:34:00+00:00"
format_iso_8601(parse_iso_8601("2013-02-08 09:34:00+01:00"))
## [1] "2013-02-08T08:34:00+00:00"
format_iso_8601(parse_iso_8601("2013-W06-5"))
## [1] "2013-02-08T00:00:00+00:00"
format_iso_8601(parse_iso_8601("2013-039"))
## [1] "2013-02-08T00:00:00+00:00"
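
Because every result is in UTC and uses the same fixed layout, plain string comparison agrees with chronological order. A short sketch, assuming the parsedate package is installed:

```r
library(parsedate)

stamps <- format_iso_8601(parse_iso_8601(c(
  "2013-02-08 09:34:00+01:00",  # 08:34 in UTC
  "2013-02-08 09:00:00"         # no offset given, treated as UTC
)))

# The first time stamp is earlier once both are expressed in UTC,
# and the character comparison reflects that.
stamps[1] < stamps[2]
sort(stamps)
```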

News

parsedate 1.2.0

  • parse_date() and parse_iso_8601() now support a default time zone, which is used for dates that do not explicitly specify one.

  • Reimplement parse_iso_8601() with vectorized code, for speed (#9).

  • Fix parse_date() and parse_iso_8601() for zero-length input (#20).

  • parse_date() parses strings with + characters correctly now (#23).

parsedate 1.1.3

  • Update URLs in the README

parsedate 1.1.2

  • Fix a vectorization bug in the ISO 8601 date parser
  • Register native routines

parsedate 1.1.1

  • Drop lubridate package dependency

  • Fix parsing dates consisting of six or eight digits, e.g. 20140922 and 092214

  • NA is returned by parse_date for nonsensical numerical dates, e.g. 000202

  • Fix parse_date time zone that was wrong for some dates

  • Fix parse_date for dates that are not in DST

parsedate 1.0.3

  • Fix a bug in format_iso_8601, on platforms that have a buggy %z

parsedate 1.0.2

  • First release on CRAN


install.packages("parsedate")

1.2.0 by Gábor Csárdi, a year ago


https://github.com/gaborcsardi/parsedate


Report a bug at https://github.com/gaborcsardi/parsedate/issues


Browse source code at https://github.com/cran/parsedate


Authors: Gábor Csárdi, Linus Torvalds




GPL-2 license


Imports rematch2

Suggests covr, testthat, withr


Imported by dataone, mds, pkgsearch, rerddapXtracto, rhub, shinytest, togglr.

