{"id":15502958,"url":"https://github.com/robsteranium/csvwr","last_synced_at":"2025-06-25T19:03:01.497Z","repository":{"id":45838868,"uuid":"297640003","full_name":"Robsteranium/csvwr","owner":"Robsteranium","description":"Read and write CSV on the Web (csvw) tables and metadata in R","archived":false,"fork":false,"pushed_at":"2024-01-21T17:05:52.000Z","size":989,"stargazers_count":16,"open_issues_count":8,"forks_count":1,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-06-20T12:01:10.640Z","etag":null,"topics":["csvw"],"latest_commit_sha":null,"homepage":"https://robsteranium.github.io/csvwr","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Robsteranium.png","metadata":{"files":{"readme":"README.md","changelog":"NEWS.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-09-22T12:21:57.000Z","updated_at":"2025-04-09T10:16:53.000Z","dependencies_parsed_at":"2025-01-29T11:41:16.766Z","dependency_job_id":null,"html_url":"https://github.com/Robsteranium/csvwr","commit_stats":null,"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"purl":"pkg:github/Robsteranium/csvwr","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Robsteranium%2Fcsvwr","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Robsteranium%2Fcsvwr/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Robsteranium%2Fcsvwr/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Robsteranium%2Fcsvwr/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/
Robsteranium","download_url":"https://codeload.github.com/Robsteranium/csvwr/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Robsteranium%2Fcsvwr/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261937019,"owners_count":23232843,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csvw"],"created_at":"2024-10-02T09:11:41.266Z","updated_at":"2025-06-25T19:03:01.478Z","avatar_url":"https://github.com/Robsteranium.png","language":"HTML","funding_links":[],"categories":[],"sub_categories":[],"readme":"# CSV on the Web R Package (csvwr) \u003cimg src=\"man/figures/logo.png\" align=\"right\" height=\"139\" /\u003e\n\n[![build](https://github.com/Robsteranium/csvwr/actions/workflows/r.yml/badge.svg)](https://github.com/Robsteranium/csvwr/actions/workflows/r.yml)\n[![pkgdown](https://github.com/Robsteranium/csvwr/actions/workflows/pkgdown.yml/badge.svg)](https://github.com/Robsteranium/csvwr/actions/workflows/pkgdown.yml)\n[![CRAN_Status_Badge](https://www.r-pkg.org/badges/version/csvwr)](https://cran.r-project.org/package=csvwr)\n\nRead and write csv tables annotated with metadata according to the \"CSV on the Web\" standard (CSVW).\n\nThe [csvw model for tabular data](https://w3c.github.io/csvw/syntax/) describes how to annotate a group of csv tables to ensure they are interpreted correctly.\n\nThis package uses the [csvw metadata schema](https://w3c.github.io/csvw/metadata/) to find tables, identify column names and cast values to the correct types.\n\nThe aim is to reduce the amount of 
manual work needed to parse and prepare data before it can be used in analysis.\n\n\n## Usage\n\n### Reading CSVW\n\nYou can use `csvwr` to read a csv table with json annotations into a data frame:\n\n```r\nlibrary(csvwr)\n\n# Parse a csv table using json metadata:\ncsvw \u003c- read_csvw(\"data.csv\", \"metadata.json\")\n\n# To extract the parsed table (with syntactic variable names and typed columns):\ncsvw$tables[[1]]$dataframe\n```\n\nAlternatively, you can jump straight to the parsed table in one call:\n\n```r\nread_csvw_dataframe(\"data.csv\", \"metadata.json\")\n```\n\n### Writing CSVW\n\nYou can also prepare annotations for a data frame:\n\n```r\n# Given a data frame (saved as a csv)\nd \u003c- data.frame(x=c(\"a\",\"b\",\"c\"), y=1:3)\nwrite.csv(d, \"table.csv\", row.names=FALSE)\n\n# Derive a schema\ns \u003c- derive_table_schema(d)\n\n# Create metadata (as a list)\nm \u003c- create_metadata(tables=list(list(url=\"table.csv\", tableSchema=s)))\n\n# Serialise the metadata to JSON\nj \u003c- jsonlite::toJSON(m)\n\n# Write the json to a file\ncat(j, file=\"metadata.json\")\n```\n\nFor a complete introduction to the library, please see `vignette(\"read-write-csvw\")`.\n\n\n## Installation\n\nYou can install the latest release from CRAN:\n\n```r\ninstall.packages(\"csvwr\")\n```\n\nOr, for the development version, you can use devtools to install `csvwr` from GitHub:\n\n```r\ninstall.packages(\"devtools\")\ndevtools::install_github(\"Robsteranium/csvwr\")\n```\n\n## Contributing\n\n### Roadmap\n\nBroadly speaking, the objectives are as follows:\n\n- parse csvw, creating dataframes with specified names and types (mostly implemented)\n- connect associated csv tables and json files according to the conventions set out in the csvw standard (partly implemented)\n- support for validating a table according to a metadata document (minimally implemented)\n- support for multiple tables (mostly implemented)\n- tools for writing csvw metadata, given an R data frame 
(partly implemented)\n- vignettes and documentation (mostly implemented)\n- scripts for running the most useful tools from the command line (not yet implemented)\n\nIt's not an urgent objective for the library to perform csv2rdf or csv2json translation, although some support for csv2json is provided because it is used to test that the parsing is done correctly.\n\nIn terms of the csvw test cases provided by the standard, the following areas need to be addressed (in rough priority order):\n\n- datatypes (most simple datatypes and some complex ones are supported, but there are more types and constraints too)\n- validations (there are a lot of these 😊)\n- propagation of inherited properties\n- http retrieval (`readr::read_csv` (and indeed `utils::read.csv`) accept URIs, but the spec also involves link, dialect, and content-type headers)\n- referential integrity (a foundation for this is in place)\n- json nesting\n\n### Testing\n\nThe project currently incorporates two main parts of the [csvw test](https://w3c.github.io/csvw/tests/) suite:\n\n- [Parsing with JSON output](https://github.com/Robsteranium/csvwr/blob/master/tests/testthat/test-csvw-parsing-json.R)\n- [Validation](https://github.com/Robsteranium/csvwr/blob/master/tests/testthat/test-csvw-validation.R)\n\nIn each case, we run only the subset of test entries that can be expected to pass given the parts of the standard implemented so far. Some entries are skipped, either permanently or while other priorities are implemented.\n\nYou can find out what needs to be implemented next by widening the subset to include the next entry.\n\nDuring development, you may find it convenient to recreate one of the test entries for exploration. There is a convenience function in [tests/csvw-tests-helpers.R](https://github.com/Robsteranium/csvwr/blob/master/tests/csvw-tests-helpers.R). This isn't exported by the package so you'll need to evaluate it explicitly. 
You can then use it as follows:\n\n```r\nrun_entry_in_dev(16) # index number in the list of entries\nrun_entry_in_dev(id=\"manifest-json#test023\") # identifier for the test\n```\n\nThere are also some more [in-depth unit tests](https://github.com/Robsteranium/csvwr/blob/master/tests/testthat/test-parsing.R) written for this library.\n\nWe use GitHub Actions to test the package against multiple architectures and the current, previous and development versions of R. If you need to test against R-devel locally then you can use the `r-devel.Dockerfile`:\n\n```shell\ndocker build -f r-devel.Dockerfile . --tag csvw-devel\ndocker run --rm \"csvw-devel\"\n```\n\n### Workflow\n\nYou can use `devtools::load_all()` (`CTRL + SHIFT + L` in RStudio) to load updates and `testthat::test_local()` (`CTRL + SHIFT + T`) to run the tests.\n\nTo check the vignettes, you need to run `devtools::install(build_vignettes=TRUE)`. Then you can open e.g. `vignette(\"read-write-csvw\")`.\n\n## License\n\nGPL-3\n\nTo discuss other licensing terms, please [get in contact](mailto:csvw@infonomics.ltd.uk).\n\n## Other CSVW tools\n\nThere's another R implementation of csvw in the package [rcsvw](https://github.com/davideceolin/rcsvw).\n\nIf you're interested in csvw more generally, then the [RDF::Tabular](https://github.com/ruby-rdf/rdf-tabular/) Ruby gem provides one of the more robust and comprehensive implementations, supporting both translation and validation.\n\nIf you're specifically interested in validation, take a look at the [ODI](https://theodi.org/)'s [csvlint](https://github.com/Data-Liberation-Front/csvlint.rb), which implements csvw, and also the [OKFN](https://okfn.org/)'s [frictionless data table schemas](https://specs.frictionlessdata.io/).\n\nIf you want rdf translation, then you might like to check out [Swirrl](https://www.swirrl.com/)'s [csv2rdf](https://github.com/Swirrl/csv2rdf/) and also [table2qb](https://github.com/swirrl/table2qb), which generates csvw annotations from 
csv files to describe [RDF Data Cubes](https://www.w3.org/TR/vocab-data-cube/).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frobsteranium%2Fcsvwr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frobsteranium%2Fcsvwr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frobsteranium%2Fcsvwr/lists"}