{"id":31790756,"url":"https://github.com/ropensci/ediutils","last_synced_at":"2026-01-16T02:38:10.530Z","repository":{"id":38442614,"uuid":"159572464","full_name":"ropensci/EDIutils","owner":"ropensci","description":"An API Client for the Environmental Data Initiative Repository","archived":false,"fork":false,"pushed_at":"2023-10-10T19:09:51.000Z","size":1085,"stargazers_count":9,"open_issues_count":8,"forks_count":2,"subscribers_count":9,"default_branch":"main","last_synced_at":"2025-09-08T16:11:10.535Z","etag":null,"topics":["ecology","eml-metadata","open-access","open-data","r","r-package","research-data-management","research-data-repository","rstats"],"latest_commit_sha":null,"homepage":"https://docs.ropensci.org/EDIutils/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ropensci.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"docs/CODE_OF_CONDUCT.html","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2018-11-28T22:13:59.000Z","updated_at":"2025-03-22T08:13:44.000Z","dependencies_parsed_at":"2023-10-05T03:57:19.620Z","dependency_job_id":"473de4f1-cfd2-4941-a70e-58e9eab44f27","html_url":"https://github.com/ropensci/EDIutils","commit_stats":{"total_commits":244,"total_committers":2,"mean_commits":122.0,"dds":0.004098360655737654,"last_synced_commit":"790f5e270ec973ebe231f623985dbd494dd7e429"},"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/ropensci/EDIutils","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ropensci%2FEDIutils","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ropensci%2FEDIutils/tags","releases_url":"https://rep
os.ecosyste.ms/api/v1/hosts/GitHub/repositories/ropensci%2FEDIutils/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ropensci%2FEDIutils/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ropensci","download_url":"https://codeload.github.com/ropensci/EDIutils/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ropensci%2FEDIutils/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279004707,"owners_count":26083750,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-10T02:00:06.843Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ecology","eml-metadata","open-access","open-data","r","r-package","research-data-management","research-data-repository","rstats"],"created_at":"2025-10-10T16:29:37.101Z","updated_at":"2025-10-10T16:29:42.707Z","avatar_url":"https://github.com/ropensci.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. 
Please edit that file --\u003e\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  # fig.path = \"man/figures/README-\",\n  fig.path = \"README-\",\n  out.width = \"100%\"\n)\n```\n\n# EDIutils\n\n\u003c!-- badges: start --\u003e\n[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)\n[![R-CMD-check](https://github.com/ropensci/EDIutils/workflows/R-CMD-check/badge.svg)](https://github.com/ropensci/EDIutils/actions)\n[![Status at rOpenSci Software Peer Review](https://badges.ropensci.org/498_status.svg)](https://github.com/ropensci/software-review/issues/498)\n[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/EDIutils)](https://cran.r-project.org/package=EDIutils)\n[![codecov.io](https://codecov.io/gh/ropensci/EDIutils/branch/main/graph/badge.svg)](https://app.codecov.io/github/ropensci/EDIutils?branch=main)\n[![DOI](https://zenodo.org/badge/159572464.svg)](https://zenodo.org/badge/latestdoi/159572464)\n\n\u003c!-- badges: end --\u003e\n\nA client for the Environmental Data Initiative repository REST API. The [EDI data repository](https://portal.edirepository.org/nis/home.jsp) is for publication and reuse of ecological data with emphasis on metadata accuracy and completeness. It was developed in collaboration with the [US LTER Network](https://lternet.edu/) and is built upon the [PASTA+ software stack](https://pastaplus-core.readthedocs.io/en/latest/index.html#). 
EDIutils includes functions to search and access existing data, evaluate and upload new data, and assist with related data management tasks.\n\n- [Search and Access Data](https://docs.ropensci.org/EDIutils/articles/search_and_access.html)\n- [Evaluate and Upload Data](https://docs.ropensci.org/EDIutils/articles/evaluate_and_upload.html)\n- [Retrieve Download Metrics](https://docs.ropensci.org/EDIutils/articles/retrieve_downloads.html)\n- [Retrieve Citation Metrics](https://docs.ropensci.org/EDIutils/articles/retrieve_citations.html)\n\n## Installation\n\nGet the latest version:\n```{r eval=FALSE}\ninstall.packages(\"EDIutils\")\n```\n\nGet the development version:\n```{r eval=FALSE}\nremotes::install_github(\"ropensci/EDIutils\", ref = \"development\")\n```\n\n## Getting Started\n\n```{r eval=FALSE}\nlibrary(EDIutils)\n```\n\nThe unit of publication is the data package. It contains one or more data entities (i.e. files) described with [EML metadata](https://eml.ecoinformatics.org/), a metadata quality report, and a manifest of package contents. Data packages are immutable for reproducible research, yet versionable to allow updates and improved data quality through time. Each version is assigned a DOI and a unique package ID of the form \"scope.identifier.revision\". The \"scope\" is the organizational unit, \"identifier\" the series, and \"revision\" the version (e.g. \"edi.100.2\" is version \"2\" of data package \"edi.100\").\n\n### Authentication\n\nAuthentication is required by data evaluation and upload functions, and to \naccess user audit logs and services. Contact EDI for an account \n\u003csupport@edirepository.org\u003e. Authenticate with the `login()` \nfunction.\n\n### Search and Access Data\n\nThe repository search service is a standard deployment of Apache Solr and indexes select metadata fields of data package metadata. For a list of searchable fields see `search_data_packages()`. 
For a browser-based search experience, use the [EDI data portal](https://portal.edirepository.org/nis/advancedSearch.jsp).\n\n```{r eval=FALSE}\n# List data packages containing the term \"water temperature\"\nres \u003c- search_data_packages(query = 'q=\"water+temperature\"\u0026fl=*')\ncolnames(res)\n#\u003e  [1] \"abstract\"              \"begindate\"             \"doi\"                  \n#\u003e  [4] \"enddate\"               \"funding\"               \"geographicdescription\"\n#\u003e  [7] \"id\"                    \"methods\"               \"packageid\"            \n#\u003e [10] \"pubdate\"               \"responsibleParties\"    \"scope\"                \n#\u003e [13] \"site\"                  \"taxonomic\"             \"title\"                \n#\u003e [16] \"authors\"               \"spatialCoverage\"       \"sources\"              \n#\u003e [19] \"keywords\"              \"organizations\"         \"singledates\"          \n#\u003e [22] \"timescales\"\n\nnrow(res)\n#\u003e [1] 798\n```\n\nData entities are downloaded as raw bytes and then parsed with a reader function.\n\n```{r eval=FALSE}\n# List data entities of data package edi.1047.1\nres \u003c- read_data_entity_names(packageId = \"edi.1047.1\")\nres\n#\u003e                           entityId                entityName\n#\u003e 1 3abac5f99ecc1585879178a355176f6d        Environmentals.csv\n#\u003e 2 f6bfa89b48ced8292840e53567cbf0c8               ByCatch.csv\n#\u003e 3 c75642ddccb4301327b4b1a86bdee906               Chinook.csv\n#\u003e 4 2c9ee86cc3f3ffc729c5f18bfe0a2a1d             Steelhead.csv\n#\u003e 5 785690848dd20f4910637250cdc96819 TrapEfficiencyRelease.csv\n#\u003e 6 58b9000439a5671ea7fe13212e889ba5 TrapEfficiencySummary.csv\n#\u003e 7 86e61c1a501b7dcf0040d10e009bfd87        TrapOperations.csv\n\n# Read raw bytes of Steelhead.csv (i.e. 
the 4th data entity)\nraw \u003c- read_data_entity(packageId = \"edi.1047.1\", entityId = res$entityId[4])\nhead(raw)\n#\u003e [1] ef bb bf 44 61 74\n\n# Parse with a .csv reader\ndata \u003c- readr::read_csv(file = raw)\ndata\n#\u003e # A tibble: 2,926 x 14\n#\u003e    Date   trapVisitID subSiteName catchRawID releaseID commonName \n#\u003e    \u003cchr\u003e        \u003cdbl\u003e \u003cchr\u003e            \u003cdbl\u003e     \u003cdbl\u003e \u003cchr\u003e      \n#\u003e  1 1/12/~         326 North Chan~      32123         0 Steelhead ~\n#\u003e  2 1/14/~         336 North Chan~      33980         0 Steelhead ~\n#\u003e  3 1/15/~         337 North Chan~      32683         0 Steelhead ~\n#\u003e  4 1/16/~         339 North Chan~      32971         0 Steelhead ~\n#\u003e  5 1/17/~         341 North Chan~      33104         0 Steelhead ~\n#\u003e  6 1/18/~         342 North Chan~      33304         0 Steelhead ~\n#\u003e  7 1/19/~         343 North Chan~      33432         0 Steelhead ~\n#\u003e  8 1/21/~         349 North Chan~      34083         0 Steelhead ~\n#\u003e  9 1/21/~         349 North Chan~      34084         0 Steelhead ~\n#\u003e 10 1/23/~         351 North Chan~      34384         0 Steelhead ~\n#\u003e # ... with 2,916 more rows, and 8 more variables:\n#\u003e #   lifeStage \u003cchr\u003e, forkLength \u003cdbl\u003e, weight \u003cdbl\u003e, n \u003cdbl\u003e,\n#\u003e #   mort \u003cchr\u003e, fishOrigin \u003cchr\u003e, markType \u003cchr\u003e,\n#\u003e #   CatchRaw.comments \u003cchr\u003e\n```\n\n### Evaluate and Upload Data\n\nThe EDI data repository has a \"[staging](https://portal-s.edirepository.org/nis/home.jsp)\" environment to test the upload and rendering of new data packages before publishing to \"[production](https://portal.edirepository.org/nis/home.jsp)\". Authentication is required by functions involving data evaluation and upload. 
Request an account from support@edirepository.org.\n\n```{r eval=FALSE}\n# Authenticate\nlogin()\n#\u003e User name: \"my_name\"\n#\u003e User password: \"my_secret\"\n```\n\nData package reservations prevent conflicting use of the same identifier.\n\n```{r eval=FALSE}\n# Reserve a data package identifier\nidentifier \u003c- create_reservation(scope = \"edi\", env = \"staging\")\nidentifier\n#\u003e [1] 595\n```\n\nEvaluation checks for metadata accuracy and completeness.\n\n```{r eval=FALSE}\n\n# Evaluate data package\ntransaction \u003c- evaluate_data_package(\n eml = paste0(tempdir(), \"/edi.595.1.xml\"), \n env = \"staging\")\ntransaction\n#\u003e [1] \"evaluate_163966785813042760\"\n\n# Check status\nstatus \u003c- check_status_evaluate(transaction, env = \"staging\")\nstatus\n#\u003e [1] TRUE\n\n# Read the evaluation report\nreport \u003c- read_evaluate_report(transaction, as = \"char\", env = \"staging\")\nmessage(report)\n#\u003e ===================================================\n#\u003e   EVALUATION REPORT\n#\u003e ===================================================\n#\u003e   \n#\u003e PackageId: edi.595.1\n#\u003e Report Date/Time: 2021-12-16T08:17:40\n#\u003e Total Quality Checks: 29\n#\u003e Valid: 21\n#\u003e Info: 8\n#\u003e Warn: 0\n#\u003e Error: 0\n#\u003e \n#\u003e ---------------------------------------------------\n#\u003e   DATASET REPORT\n#\u003e ---------------------------------------------------\n#\u003e   \n#\u003e IDENTIFIER: packageIdPattern\n#\u003e NAME: packageId pattern matches \"scope.identifier.revision\"\n#\u003e DESCRIPTION: Check against LTER requirements for scope.identifier.revision\n#\u003e EXPECTED: 'scope.n.m', where 'n' and 'm' are integers and 'scope' is one ...\n#\u003e FOUND: edi.595.1\n#\u003e STATUS: valid\n#\u003e EXPLANATION: \n#\u003e SUGGESTION: \n#\u003e REFERENCE: \n#\u003e \n#\u003e IDENTIFIER: emlVersion\n#\u003e NAME: EML version 2.1.0 or beyond\n#\u003e DESCRIPTION: Check the EML document declaration for 
version 2.1.0 or higher\n#\u003e EXPECTED: eml://ecoinformatics.org/eml-2.1.0 or higher\n#\u003e FOUND: https://eml.ecoinformatics.org/eml-2.2.0\n#\u003e STATUS: valid\n#\u003e EXPLANATION: Validity of this quality report is dependent on this check ...\n#\u003e SUGGESTION: \n#\u003e REFERENCE: \n#\u003e ...\n```\n\nUpload the data package once all errors and warnings are fixed.\n\n```{r eval=FALSE}\n# Create a new data package\ntransaction \u003c- create_data_package(\n eml = paste0(tempdir(), \"/edi.595.1.xml\"), \n env = \"staging\")\ntransaction\n#\u003e [1] \"create_163966765080210573__edi.595.1\"\n\n# Check status\nstatus \u003c- check_status_create(\n transaction = transaction, \n env = \"staging\")\nstatus\n#\u003e [1] TRUE\n```\n\nOnce everything looks good in the \"staging\" environment, repeat the reservation and upload steps above in the \"production\" environment, where the data package will be assigned a DOI and made discoverable alongside other published data.\n\n## Getting help\n\nUse [GitHub Issues](https://github.com/ropensci/EDIutils/issues) for bug reports, feature requests, and general questions or discussion. When filing a bug report, please include a minimal reproducible example.\n\n## Contributing\n\nCommunity contributions are welcome! Please see our [contributing guidelines](https://github.com/ropensci/EDIutils/blob/master/CONTRIBUTING.md) for details.\n\n-----\n\nPlease note that this package is released with a [Contributor Code of Conduct](https://ropensci.org/code-of-conduct/). By contributing to this project, you agree to abide by its terms.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fropensci%2Fediutils","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fropensci%2Fediutils","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fropensci%2Fediutils/lists"}