{"id":23782607,"url":"https://github.com/globalgov/messydates","last_synced_at":"2025-09-06T01:32:44.930Z","repository":{"id":38848085,"uuid":"384110337","full_name":"globalgov/messydates","owner":"globalgov","description":"R package for Extended Date/Time Format (EDTF)","archived":false,"fork":false,"pushed_at":"2025-06-02T14:00:03.000Z","size":15678,"stargazers_count":17,"open_issues_count":1,"forks_count":2,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-07-11T18:39:40.789Z","etag":null,"topics":["dates","r"],"latest_commit_sha":null,"homepage":"https://globalgov.github.io/messydates","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/globalgov.png","metadata":{"files":{"readme":"README.Rmd","changelog":"NEWS.md","contributing":".github/CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":".github/CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2021-07-08T12:03:00.000Z","updated_at":"2025-06-11T05:55:04.000Z","dependencies_parsed_at":"2025-06-02T02:05:08.080Z","dependency_job_id":"a7e61f60-6204-437d-8ff7-d03b36ea8d87","html_url":"https://github.com/globalgov/messydates","commit_stats":null,"previous_names":[],"tags_count":20,"template":false,"template_full_name":null,"purl":"pkg:github/globalgov/messydates","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/globalgov%2Fmessydates","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/globalgov%2Fmessydates/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/globalgov%2Fmessydates/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/globalgov%2Fmessydates/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/globalgov","download_url":"https://codeload.github.com/globalgov/messydates/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/globalgov%2Fmessydates/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273846959,"owners_count":25178627,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-05T02:00:09.113Z","response_time":402,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dates","r"],"created_at":"2025-01-01T12:16:55.160Z","updated_at":"2025-09-06T01:32:44.917Z","avatar_url":"https://github.com/globalgov.png","language":"R","readme":"---\noutput: github_document\nalways_allow_html: true\n---\n\n# messydates \u003cimg src=\"man/figures/messydates_hexlogo.png\" alt=\"messydates package logo\" align=\"right\" width=\"220\"/\u003e\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/README-\",\n  out.width = \"100%\"\n)\n```\n\n\u003c!-- badges: start --\u003e\n[![Lifecycle: maturing](https://img.shields.io/badge/lifecycle-maturing-blue.svg)](https://lifecycle.r-lib.org/articles/stages.html#maturing)\n![CRAN/METACRAN](https://img.shields.io/cran/v/messydates)\n![GitHub release (latest by 
date)](https://img.shields.io/github/v/release/globalgov/messydates)\n![GitHub Release Date](https://img.shields.io/github/release-date/globalgov/messydates)\n\u003c!-- ![GitHub issues](https://img.shields.io/github/issues-raw/globalgov/messydates) --\u003e\n[![Codecov test coverage](https://codecov.io/gh/globalgov/messydates/branch/main/graph/badge.svg)](https://app.codecov.io/gh/globalgov/messydates?branch=main)\n[![CodeFactor](https://www.codefactor.io/repository/github/globalgov/messydates/badge)](https://www.codefactor.io/repository/github/globalgov/messydates)\n[![CII Best Practices](https://bestpractices.coreinfrastructure.org/projects/5061/badge)](https://bestpractices.coreinfrastructure.org/projects/5061)\n\u003c!-- badges: end --\u003e\n\n## Why this package?\n\nExisting packages for working with dates in R expect them to be _tidy_.\nThat is, they should be in or coercible to the standard `yyyy-mm-dd` format.\n\nBut dates are often **_messy_**.\nSometimes we only know the year when something happened,\nleaving other components of the date, such as the month or day, _unspecified_.\nThis is often the case with historical dates, for instance.\nSometimes we can only say _approximately_ when an event occurred,\nthat it occurred _before_ or _after_ a certain date,\nor we recognise that our best estimate comes from a _dubious_ source.\nOther times there exists a _set_ or _range_ of possible dates for an event.\n\nAlthough researchers generally recognise this messiness,\nmany feel expected to force artificial precision or unfortunate imprecision \non temporal data to proceed with analysis.\nFor example, if we only know something happened in `2021`,\nthen we might revert to a panel data design\n_even if greater precision is available_,\nor opt to replace this date with the start of that year (`2021-01-01`),\nassuming that erring on the earlier (or later) side is more justifiable than \na random date within that month or year.\n\nHowever, this can create 
inferential issues when timing or sequence is important.\n`{messydates}` assists with this problem by retaining and working\nwith various kinds of date imprecision.\n\n```{r setup, echo=FALSE, include=FALSE, warning=FALSE}\nlibrary(lubridate)\nlibrary(tibble)\nlibrary(dplyr)\nlibrary(knitr)\nlibrary(kableExtra)\n```\n\n## A quick overview\n\n`{messydates}` implements for R the Extended Date/Time Format (EDTF) annotations\nset by the International Organization for Standardization (ISO) \nand outlined in [ISO 8601-2:2019(E)](https://www.iso.org/standard/70908.html).\n`{messydates}` introduces a new `mdate` class that embeds these annotations,\nand offers a set of methods for constructing and coercing into and from the `mdate` class,\nas well as tools for working with such 'messy' dates.\n\n```{r comparison, warning=FALSE, message=FALSE}\npkg_comparison \u003c- tibble::tribble(~Example, ~OriginalDate,\n                                  \"Normal date\", \"2012-01-01\",\n                                  \"Future date\", \"2599-12-31\",\n                                  \"Historical date\", \"476\",\n                                  \"Era date\", \"33 BC\",\n                                  \"Written date\", \"First of February, two thousand and twelve\",\n                                  \"MDY date\", \"10-31-2012\",\n                                  \"DMY date\", \"31-10-2012\",\n                                  \"Wrongly specified date\", \"2012-31-10\",\n                                  \"Approximate date\", \"2012-01-12~\",\n                                  \"Uncertain date\", \"2012-01-01?\",\n                                  \"Unspecified date\", \"2012-01\",\n                                  \"Censored date\", \"..2012-01-12\", \n                                  \"Range of dates\", \"2012-11-01:2012-12-01\",\n                                  \"Set of dates\", \"2012-5-26, 2012-11-19, 2012-12-4\") %\u003e%\n  dplyr::mutate(base = as.Date(OriginalDate),\n  
              lubridate = suppressWarnings(lubridate::as_date(OriginalDate)),\n                messydates = messydates::as_messydate(OriginalDate))\n```\n\n```{r compprint, echo=FALSE, results='asis'}\nkbl(pkg_comparison) %\u003e% kable_styling(\"striped\") %\u003e% \n  column_spec(3, \n              color = ifelse((paste(pkg_comparison$base) == paste(pkg_comparison$messydates)),\n                           \"black\", \"red\")) %\u003e% \n  column_spec(4, \n              color = ifelse((paste(pkg_comparison$lubridate) == paste(pkg_comparison$messydates)),\n                           \"black\", \"red\")) %\u003e% \n  column_spec(5, \n              color = ifelse((paste(pkg_comparison$messydates) == paste(pkg_comparison$messydates)),\n                           \"black\", \"red\"))\n```\n\nAs can be seen in the table above,\nother date/time packages in R do not handle 'messy' dates well.\nNormal \"yyyy-mm-dd\" structures or other date formats that can easily be\ncoerced into this structure are usually not a problem.\n\nHowever, some syntaxes are entirely ignored,\nsuch as historical dates and dates from other eras (e.g. 
BCE),\nas well as written dates, frequently used in historical texts or treaties.\n\nOther times, existing packages return a date, but strip away any annotations \nthat express uncertainty or approximation,\nintroducing artificial precision.\n\nAnd sometimes returning only a single date means ignoring other information included.\nWe see this here in how only the end of the censored date,\nonly the start of the date range, or the first in the set of dates is returned.\nSometimes date components even seem guessed, such as how `2021-01` (January 2021)\nis assumed to be 1 _December_ 2021 by `{lubridate}`.\n\nOf these packages, only `{messydates}` enables researchers to retain all this information.\nBut most analyses still expect some precision in dates to work.\n\n## Working with messy dates\n\nThe first way that `{messydates}` assists researchers who use dates in the `mdate` class\nis to provide methods for converting back into common date classes \nsuch as `Date`, `POSIXct`, and `POSIXlt`.\nIt is thus fully compatible with packages such as `{lubridate}` and `{anydate}`.\n\nAs messy date annotations can indicate multiple possible dates,\n`{messydates}` allows e.g. 
ranges or sets of dates to be \nunpacked or expanded into all compatible dates.\n\nSince most methods of analysis or modelling expect single date observations,\nwe offer ways to resolve this multiplicity when coercing `mdate`-class objects\ninto other date formats.\nFor example, a researcher might explicitly choose to favour\nthe `min()`, `max()`, `mean()`, `median()`, or even a `random()` date.\nThis facilitates research transparency by demanding a conscious choice from researchers,\nand supports robustness checks by enabling description or inference across\nall dates compatible with the messy annotated date.\n\n```{r work}\nresolve_mdate \u003c- pkg_comparison %\u003e% \n  dplyr::select(messydates) %\u003e% \n  dplyr::mutate(min = as.Date(messydates, min),\n         median = as.Date(messydates, median),\n         max = as.Date(messydates, max))\n```\n\n```{r workprint, echo = FALSE, results='asis'}\nkbl(resolve_mdate) %\u003e% kable_styling(\"striped\")\n```\n\nAs can be seen in the table above, all 'precise' dates are respected as such,\nand returned no matter what 'resolution' function is given.\nBut for messy dates, the choice of function can make a difference.\nWhere only a year is given, e.g. 
`0476` or `-0033`,\nwe draw from all the days in the year.\nThe minimum is the first of January and the maximum the 31st of December.\nDates are also drawn from a set or range of dates when given.\n\nWhen only an approximate or censored date is known,\na range of dates is imputed based on some window (by default 3 years, months, or days),\ndepending on whether the whole date or just a component of the date is annotated,\nand a precise date is then resolved from that range.\n\nThis translation via an expanded list of compatible dates is fast, robust, and extensible,\nallowing researchers to use messy dates in an analytic strategy that uses any other package.\n\n## Cheat Sheet\n\nPlease see the cheat sheet and [the messydates website](https://globalgov.github.io/messydates/) for more information about \nhow to use `{messydates}`.\n\n\u003ca href=\"https://github.com/globalgov/messydates/blob/main/inst/figures/cheatsheet.pdf\"\u003e\u003cimg src=\"https://raw.githubusercontent.com/globalgov/messydates/main/man/figures/cheatsheet.png\" alt=\"messydates cheatsheet\" width=\"525\" height=\"378\"/\u003e\u003c/a\u003e\n\n## Installation\n\nThe easiest way to install `{messydates}` is directly from CRAN:\n\n``` r\ninstall.packages(\"messydates\")\n```\n\nHowever, you may also install the development version from [GitHub](https://github.com/).\n\n``` {r git, eval=FALSE}\n# install.packages(\"remotes\")\nremotes::install_github(\"globalgov/messydates\")\n```\n\n## Funding\n\nThe package was developed as part of [the PANARCHIC project](https://panarchic.ch), \nwhich studies the effects of network and power on how quickly states join, reform, or create \ninternational institutions by examining the historical dynamics of institutional networks from different domains.\n\nThe PANARCHIC project is funded by the Swiss National Science Foundation ([SNSF](https://data.snf.ch/grants/grant/188976)).\nFor more information on current projects of the Geneva Global Governance 
Observatory, \nplease see [our GitHub website](https://github.com/globalgov).\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fglobalgov%2Fmessydates","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fglobalgov%2Fmessydates","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fglobalgov%2Fmessydates/lists"}