{"id":26005224,"url":"https://github.com/samhollings/nhs_data_cleansing","last_synced_at":"2025-10-05T12:59:59.447Z","repository":{"id":278423230,"uuid":"912892252","full_name":"SamHollings/nhs_data_cleansing","owner":"SamHollings","description":"A repo of reusable functions for cleansing data","archived":false,"fork":false,"pushed_at":"2025-02-26T20:57:18.000Z","size":54,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-26T21:28:36.294Z","etag":null,"topics":["cleansing","data","data-cleaning","data-cleansing","preprocessing","pyspark","python","python3"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SamHollings.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-06T15:55:41.000Z","updated_at":"2025-02-26T20:54:22.000Z","dependencies_parsed_at":"2025-02-19T18:33:15.424Z","dependency_job_id":null,"html_url":"https://github.com/SamHollings/nhs_data_cleansing","commit_stats":null,"previous_names":["samhollings/nhs_data_cleansing"],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/SamHollings/nhs_data_cleansing","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SamHollings%2Fnhs_data_cleansing","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SamHollings%2Fnhs_data_cleansing/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SamHollings%2Fnhs_data_cleansing/releases","manifest
s_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SamHollings%2Fnhs_data_cleansing/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SamHollings","download_url":"https://codeload.github.com/SamHollings/nhs_data_cleansing/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SamHollings%2Fnhs_data_cleansing/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278457468,"owners_count":25989956,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-05T02:00:06.059Z","response_time":54,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cleansing","data","data-cleaning","data-cleansing","preprocessing","pyspark","python","python3"],"created_at":"2025-03-05T20:46:55.906Z","updated_at":"2025-10-05T12:59:59.431Z","avatar_url":"https://github.com/SamHollings.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# nhs_data_cleansing\n[![CI](https://github.com/SamHollings/nhs_data_cleansing/actions/workflows/main.yml/badge.svg)](https://github.com/SamHollings/nhs_data_cleansing/actions/workflows/main.yml) ![Static Badge](https://img.shields.io/badge/status-development-blue) [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\n[![License: 
MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\n## Description\nThis repo builds the `nhs_data_cleansing` Python package, which contains generic Python functions (specifically using the PySpark library and data structures) for data cleansing.\n\nThe functions can be seen in [`src`](src).\n\nToDo: Add Sphinx documentation (or something similar, automatically built)\n\n## Installation\n```bash\npip install nhs_data_cleansing\n```\n\n## Usage\nGenerally, simply add `nhs_data_cleansing` to your list of dependencies/requirements, then install the package.\n\n\u003e [!NOTE]\n\u003e It's best practice to specify a version of the library in your list of dependencies, so that your existing work is not affected when the package is updated.\n\u003e The version numbers may need to be updated in the future, particularly if you want to use newer functionality.\n\n### pip\nAdd `nhs_data_cleansing` to a `requirements.txt` file within the project, then run `pip install -r requirements.txt`.\n\n### Foundry\nAdd `nhs_data_cleansing` to the `conda_recipe/meta.yml` file following the [Foundry \"python libraries\" guidance](https://www.palantir.com/docs/foundry/transforms-python/use-python-libraries).\n\n## Contact\n\u003cadd contact email address\u003e\n\n## Licence\nUnless stated otherwise (and in keeping with the [NHS Open Source Policy](https://github.com/nhsx/open-source-policy/blob/main/open-source-policy.md#b-readmes)), the codebase is released under the MIT License. This covers both the codebase and any sample code in the documentation. 
The documentation is © Crown copyright and available under the terms of the Open Government Licence v3.0.\n\n## Contribution\nIf you want to help build and improve this package, see the [contributing guidelines](CONTRIBUTE.md).\n\n---\nThis README has been built in line with guidance from the [NHS Open Source Policy](https://github.com/nhsx/open-source-policy/blob/main/open-source-policy.md#b-readmes) and [govcookiecutter](https://github.com/best-practice-and-impact/govcookiecutter/tree/main).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsamhollings%2Fnhs_data_cleansing","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsamhollings%2Fnhs_data_cleansing","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsamhollings%2Fnhs_data_cleansing/lists"}