{"id":31530186,"url":"https://github.com/frictionlessdata/datapackage-rb","last_synced_at":"2025-10-04T01:19:14.519Z","repository":{"id":62555660,"uuid":"14900475","full_name":"frictionlessdata/datapackage-rb","owner":"frictionlessdata","description":"Ruby library and tools for working with datapackages","archived":false,"fork":false,"pushed_at":"2021-08-20T08:05:22.000Z","size":284,"stargazers_count":11,"open_issues_count":7,"forks_count":5,"subscribers_count":9,"default_branch":"main","last_synced_at":"2025-09-28T21:49:28.459Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/frictionlessdata.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2013-12-03T17:17:34.000Z","updated_at":"2021-08-20T08:05:24.000Z","dependencies_parsed_at":"2022-11-03T05:31:05.713Z","dependency_job_id":null,"html_url":"https://github.com/frictionlessdata/datapackage-rb","commit_stats":null,"previous_names":["theodi/datapackage.rb"],"tags_count":30,"template":false,"template_full_name":null,"purl":"pkg:github/frictionlessdata/datapackage-rb","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/frictionlessdata%2Fdatapackage-rb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/frictionlessdata%2Fdatapackage-rb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/frictionlessdata%2Fdatapackage-rb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/frictionlessdata%2Fdatapackage-rb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/ow
ners/frictionlessdata","download_url":"https://codeload.github.com/frictionlessdata/datapackage-rb/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/frictionlessdata%2Fdatapackage-rb/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278252223,"owners_count":25956264,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-03T02:00:06.070Z","response_time":53,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-10-04T01:19:02.479Z","updated_at":"2025-10-04T01:19:14.510Z","avatar_url":"https://github.com/frictionlessdata.png","language":"Ruby","readme":"# datapackage-rb\n\n[![Build](https://img.shields.io/github/workflow/status/frictionlessdata/datapackage-rb/general/main)](https://github.com/frictionlessdata/datapackage-rb/actions)\n[![Coverage](https://img.shields.io/codecov/c/github/frictionlessdata/datapackage-rb/main)](https://codecov.io/gh/frictionlessdata/datapackage-rb)\n[![Release](http://img.shields.io/gem/v/datapackage.svg)](https://rubygems.org/gems/datapackage)\n[![Codebase](https://img.shields.io/badge/codebase-github-brightgreen)](https://github.com/frictionlessdata/datapackage-rb)\n[![Support](https://img.shields.io/badge/support-discord-brightgreen)](https://discordapp.com/invite/Sewv6av)\n\nA ruby library for working with [Data 
Packages](https://specs.frictionlessdata.io/data-package/).\n\nThe library is intended to support:\n\n* Parsing and using data package metadata and data\n* Validating data packages to ensure they conform with the Data Package specification\n\n## Installation\n\nAdd the gem to your Gemfile:\n\n```\ngem 'datapackage'\n```\n\nOr:\n\n```\ngem install datapackage\n```\n\n## Reading a Data Package\n\nRequire the gem, if you need to:\n\n```ruby\nrequire 'datapackage'\n```\n\nParsing a data package descriptor from a remote location:\n\n```ruby\npackage = DataPackage::Package.new( \"http://example.org/datasets/a/datapackage.json\" )\n```\n\nThis assumes that `http://example.org/datasets/a/datapackage.json` exists.\nSimilarly, you can load a package descriptor from a local JSON file.\n\n```ruby\npackage = DataPackage::Package.new( \"/my/data/package/datapackage.json\" )\n```\n\nThe data package descriptor,\ni.e. the `datapackage.json` file, is expected to be in the _root_ directory\nof the data package and the `path` attribute of the package's `resources` will be resolved\nrelative to it.\n\nYou can also load a data package descriptor directly from a Hash:\n\n```ruby\ndescriptor = {\n  'resources'=\u003e [\n    {\n      'name'=\u003e 'example',\n      'profile'=\u003e 'tabular-data-resource',\n      'data'=\u003e [\n        ['height', 'age', 'name'],\n        ['180', '18', 'Tony'],\n        ['192', '32', 'Jacob'],\n      ],\n      'schema'=\u003e  {\n        'fields'=\u003e [\n          {'name'=\u003e 'height', 'type'=\u003e 'integer'},\n          {'name'=\u003e 'age', 'type'=\u003e 'integer'},\n          {'name'=\u003e 'name', 'type'=\u003e 'string'},\n        ],\n      }\n    }\n  ]\n}\n\npackage = DataPackage::Package.new(descriptor)\n```\n\nThere is a set of helper methods for accessing data from the package, e.g.:\n\n```ruby\npackage.name\npackage.title\npackage.description\npackage.homepage\npackage.license\n```\n\n## Reading Data Resources\n\nA data package must 
contain an array of [Data Resources](https://specs.frictionlessdata.io/data-resource).\nYou can access the resources in your Data Package either by their name or by their index in the `resources` array:\n\n```ruby\nfirst_resource = package.resources[0]\nfirst_resource = package.get_resource('example')\n\n# Get info about the data source of this resource\nfirst_resource.inline?\nfirst_resource.local?\nfirst_resource.remote?\nfirst_resource.multipart?\nfirst_resource.tabular?\nfirst_resource.source\n```\n\nYou can then read the source depending on its type. For example, if a resource is local and not multipart, it can be opened as a file: `File.open(resource.source)`.\n\nIf a resource complies with the [Tabular Data Resource spec](https://specs.frictionlessdata.io/tabular-data-resource/) or uses the\n`tabular-data-resource` [profile](#profiles) you can read resource rows:\n\n```ruby\nresource = package.resources[0]\nresource.tabular?\nresource.headers\nresource.schema\n\n# Read all the rows at once\ndata = resource.read\ndata = resource.read(keyed: true)\n\n# Or iterate through it\ndata = resource.iter {|row| print row}\n```\n\nSee the [TableSchema](https://github.com/frictionlessdata/tableschema-rb) documentation for other things you can do with a tabular resource.\n\n## Creating a Data Package\n\n```ruby\npackage = DataPackage::Package.new\n\n# Add package properties\npackage.name = 'my_sleep_duration'\n\n# Add a resource\npackage.add_resource(\n  {\n    'name'=\u003e 'sleep_durations_this_week',\n    'data'=\u003e [7, 8, 5, 6, 9, 7, 8],\n  }\n)\n```\n\nIf the resource is valid it will be added to the `resources` array of the Data Package;\nif it's invalid it will not be added and you should try creating and [validating](#validating-a-resource) your resource to see why it fails.\n\n```ruby\n# Update a resource\nmy_resource = package.get_resource('sleep_durations_this_week')\nmy_resource['schema'] = {\n  'fields'=\u003e [\n    {'name'=\u003e 'number_hours', 'type'=\u003e 
'integer'},\n  ]\n}\n\n# Save the Data Package descriptor to the target file\npackage.save('datapackage.json')\n\n# Remove a resource\npackage.remove_resource('sleep_durations_this_week')\n```\n\n## Profiles\n\nData Package and Data Resource descriptors can be validated against [JSON schemas](https://tools.ietf.org/html/draft-zyp-json-schema-04) that we call `profiles`.\n\nBy default, this gem uses the standard [Data Package profile](http://specs.frictionlessdata.io/schemas/data-package.json) and [Data Resource profile](http://specs.frictionlessdata.io/schemas/data-resource.json) but alternative profiles are available for both.\n\nAccording to the [specs](https://specs.frictionlessdata.io/profiles/), the value of\nthe `profile` property can be either a URL or an identifier from [the registry](https://specs.frictionlessdata.io/schemas/registry.json).\n\n### Profiles in the local cache\n\nThe profiles from the registry come bundled with the gem. You can reference them in your Data Package descriptor by their identifier in [the registry](https://specs.frictionlessdata.io/schemas/registry.json):\n\n- `data-package` the default profile for a [Data Package](https://specs.frictionlessdata.io/data-package/)\n- `data-resource` the default profile for a [Data Resource](https://specs.frictionlessdata.io/data-resource)\n- `tabular-data-package` for a [Tabular Data Package](http://specs.frictionlessdata.io/tabular-data-package/)\n- `tabular-data-resource` for a [Tabular Data Resource](https://specs.frictionlessdata.io/tabular-data-resource/)\n- `fiscal-data-package` for a [Fiscal Data Package](http://fiscal.dataprotocols.org/spec/)\n\n```ruby\n{\n  \"profile\": \"tabular-data-package\"\n}\n```\n\n### Profiles from elsewhere\n\nIf you have a custom profile schema you can reference it by its URL:\n\n```ruby\n{\n  \"profile\": \"https://specs.frictionlessdata.io/schemas/tabular-data-package.json\"\n}\n```\n\n## Validation\n\nData Resources and Data Packages are validated against 
their profiles to ensure they respect the expected structure.\n\n### Validating a Resource\n\n```ruby\ndescriptor = {\n  'name'=\u003e 'incorrect name',\n  'path'=\u003e 'https://cdn.rawgit.com/frictionlessdata/datapackage-rb/master/spec/fixtures/test-pkg/test.csv',\n}\nresource = DataPackage::Resource.new(descriptor, base_path='')\n\n# Returns true if resource is valid, false otherwise\nresource.valid?\n\n# Returns true or raises DataPackage::ValidationError\nresource.validate\n\n# Iterate through validation errors\nresource.iter_errors{ |err| p err}\n```\n\n### Validating a Package\n\nThe same methods used to check the validity of a Resource - `valid?`, `validate` and `iter_errors` - are also available for a Package.\nThe difference is that after a Package descriptor is validated against its `profile`, each of its `resources` is also validated against its own `profile`.\n\nFor a Package to be valid, all its Resources have to be valid.\n\n## Developer notes\n\nThese notes are intended to help people who want to contribute to this package itself. If you just want to use it, you can safely ignore them.\n\nAfter checking out the repo, run `bundle` to install dependencies. 
Then, run `rake spec` to run the tests.\n\nTo install this gem onto your local machine, run `bundle exec rake install`.\nTo release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`,\nwhich will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).\n\n### Updating the local schemas cache\n\nWe cache the local schemas from https://specs.frictionlessdata.io/schemas/registry.json.\nThe local schemas should be kept up to date with the remote ones using:\n\n```\nrake update_profiles\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffrictionlessdata%2Fdatapackage-rb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffrictionlessdata%2Fdatapackage-rb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffrictionlessdata%2Fdatapackage-rb/lists"}