{"id":23421142,"url":"https://github.com/nhsdigital/ndr_import","last_synced_at":"2025-06-25T12:31:59.103Z","repository":{"id":35357940,"uuid":"39620614","full_name":"NHSDigital/ndr_import","owner":"NHSDigital","description":"National Disease Registers import ETL gem","archived":false,"fork":false,"pushed_at":"2025-06-11T09:36:34.000Z","size":2120,"stargazers_count":5,"open_issues_count":7,"forks_count":7,"subscribers_count":15,"default_branch":"main","last_synced_at":"2025-06-11T10:48:27.825Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/NHSDigital.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2015-07-24T08:28:04.000Z","updated_at":"2025-06-11T09:36:37.000Z","dependencies_parsed_at":"2023-10-11T12:43:18.196Z","dependency_job_id":"1d236f0f-2f7e-4803-ae8c-df71e5bdfe88","html_url":"https://github.com/NHSDigital/ndr_import","commit_stats":{"total_commits":409,"total_committers":22,"mean_commits":18.59090909090909,"dds":0.6136919315403423,"last_synced_commit":"2d59386848a88ec57cf0afaae21310a2787f4bb0"},"previous_names":[],"tags_count":82,"template":false,"template_full_name":null,"purl":"pkg:github/NHSDigital/ndr_import","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NHSDigital%2Fndr_import","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NHSDigital%2Fndr_import/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/NHSDigital%2Fndr_import/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NHSDigital%2Fndr_import/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/NHSDigital","download_url":"https://codeload.github.com/NHSDigital/ndr_import/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NHSDigital%2Fndr_import/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261874614,"owners_count":23223150,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-23T02:14:09.040Z","updated_at":"2025-06-25T12:31:59.065Z","avatar_url":"https://github.com/NHSDigital.png","language":"Ruby","funding_links":[],"categories":[],"sub_categories":[],"readme":"# NdrImport [![Build Status](https://github.com/NHSDigital/ndr_import/workflows/Test/badge.svg)](https://github.com/NHSDigital/ndr_import/actions?query=workflow%3Atest) [![Gem Version](https://badge.fury.io/rb/ndr_import.svg)](https://rubygems.org/gems/ndr_import) [![Documentation](https://img.shields.io/badge/ndr_import-docs-blue.svg)](https://www.rubydoc.info/gems/ndr_import)\nThis is the NHS Digital (NHSD) National Disease Registers (NDR) Import ETL ruby gem, providing:\n\n1. file import handlers for *extracting* data from delimited files (csv, pipe, tab, thorn), JSON Lines, .xls(x) spreadsheets, .doc(x) word documents, PDF, PDF AcroForms, XML, 7-Zip, Zip, avro and VCF files.\n2. 
table mappers for *transforming* tabular and non-tabular data into key-value pairs grouped by a common \"klass\".\n\n## Installation\n\nAdd this line to your application's Gemfile:\n\n```ruby\ngem 'ndr_import'\n```\n\nAnd then execute:\n\n    $ bundle\n\nOr install it yourself by cloning the project, then executing:\n\n    $ gem install ndr_import.gem\n\n## Usage\n\nBelow is an example that extracts data from a PDF and transforms it into a collection of records defined by their \"klasses\" and \"fields\":\n\n```ruby\nrequire 'ndr_import/non_tabular/table'\nrequire 'ndr_import/file/registry'\n\nunzip_path = SafePath.new(...)\nsource_file = SafePath.new(...).join(...)\noptions = { 'unzip_path' =\u003e unzip_path }\n\ntable = NdrImport::NonTabular::Table.new(...)\n\n# Use the Registry to enumerate over the files and their tables\nfiles = NdrImport::File::Registry.files(source_file, options)\nfiles.each do |filename|\n  tables = NdrImport::File::Registry.tables(filename, nil, options)\n  tables.each do |_tablename, table_content|\n    # Use the NonTabular::Table to tabulate the table_content\n    table.transform(table_content).each do |_klass, _fields, _index|\n      # Your code goes here\n    end\n  end\nend\n```\n\nSee `test/readme_test.rb` for a more complete working example.\n\nMore information on the workings of the mapper is available in the [wiki](https://github.com/NHSDigital/ndr_import/wiki).\n\n## Development\n\nAfter checking out the repo, run `bin/setup` to install dependencies. Then, run `rake test` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.\n\nTo install this gem onto your local machine, run `bundle exec rake install`. 
To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).\n\n## Contributing\n\n1. Fork it ( https://github.com/NHSDigital/ndr_import/fork )\n2. Create your feature branch (`git checkout -b my-new-feature`)\n3. Commit your changes (`git commit -am 'Add some feature'`)\n4. Push to the branch (`git push origin my-new-feature`)\n5. Create a new Pull Request\n\nPlease note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.\n\n## Test Data\n\nAll test data in this repository is fictitious. Any resemblance to real persons, living or dead, is purely coincidental, although Mighty Boosh references have been used in some tests.\n\nNote: Real codes (postcodes, for example) exist in the tests but bear no relation to real patient data. Please ensure that you *always* commit only dummy data when contributing to this project.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnhsdigital%2Fndr_import","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnhsdigital%2Fndr_import","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnhsdigital%2Fndr_import/lists"}