{"id":34042944,"url":"https://github.com/john-sandall/maven","last_synced_at":"2026-03-27T02:44:51.986Z","repository":{"id":35004828,"uuid":"195657149","full_name":"john-sandall/maven","owner":"john-sandall","description":"Maven provides easy access to open datasets in both raw and model-ready formats.","archived":false,"fork":false,"pushed_at":"2022-12-08T05:51:25.000Z","size":188,"stargazers_count":10,"open_issues_count":13,"forks_count":6,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-10-28T19:58:53.404Z","etag":null,"topics":["data","etl","maven","open","pipeline","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/john-sandall.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-07-07T13:53:34.000Z","updated_at":"2023-04-27T15:38:01.000Z","dependencies_parsed_at":"2023-01-15T11:51:24.159Z","dependency_job_id":null,"html_url":"https://github.com/john-sandall/maven","commit_stats":null,"previous_names":[],"tags_count":13,"template":false,"template_full_name":null,"purl":"pkg:github/john-sandall/maven","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/john-sandall%2Fmaven","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/john-sandall%2Fmaven/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/john-sandall%2Fmaven/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/john-sandall%2Fmaven/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/john-sandall","download_url":"https://codeload.
github.com/john-sandall/maven/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/john-sandall%2Fmaven/sbom","scorecard":{"id":528123,"data":{"date":"2025-08-11","repo":{"name":"github.com/john-sandall/maven","commit":"8cedaae887f2a0beca968f9652805445abf160f5"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":1.7,"checks":[{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the 
dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Code-Review","score":0,"reason":"Found 0/9 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file 
detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: Apache License 2.0: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Vulnerabilities","score":0,"reason":"25 existing vulnerabilities detected","details":["Warn: Project is vulnerable to: GHSA-29gw-9793-fvw7","Warn: Project is vulnerable to: PYSEC-2020-92 / GHSA-hj5v-574p-mj7c","Warn: Project is vulnerable to: PYSEC-2022-42969","Warn: Project is vulnerable to: PYSEC-2021-140 / GHSA-9w8r-397f-prfh","Warn: Project is vulnerable to: PYSEC-2023-117 / GHSA-mrwq-x4v8-fh7p","Warn: Project is vulnerable to: PYSEC-2021-141 / GHSA-pq64-v7f5-gqh8","Warn: Project is vulnerable to: GHSA-jfmj-5v4g-7637","Warn: Project is vulnerable to: PYSEC-2022-42986 / GHSA-43fp-rhv2-5gv8","Warn: Project is vulnerable to: PYSEC-2023-135 / GHSA-xqr8-7jwr-rhp7","Warn: Project is vulnerable to: PYSEC-2024-60 / GHSA-jjg7-2v4v-x38h","Warn: Project is vulnerable to: PYSEC-2021-856 / GHSA-5545-2q6w-2gh6","Warn: Project is vulnerable to: GHSA-6p56-wp2h-9hxr","Warn: Project is vulnerable to: PYSEC-2021-857 / 
GHSA-f7c7-j99h-c22f","Warn: Project is vulnerable to: GHSA-fpfv-jqm9-f5jm","Warn: Project is vulnerable to: PYSEC-2020-73","Warn: Project is vulnerable to: GHSA-9hjg-9r4m-mvj7","Warn: Project is vulnerable to: GHSA-9wx4-h78v-vm56","Warn: Project is vulnerable to: PYSEC-2023-74 / GHSA-j8r2-6x86-q33q","Warn: Project is vulnerable to: GHSA-34jh-p97f-mpxf","Warn: Project is vulnerable to: PYSEC-2023-212 / GHSA-g4mx-q9vg-27p4","Warn: Project is vulnerable to: PYSEC-2020-149 / GHSA-hmv2-79q8-fv6g","Warn: Project is vulnerable to: GHSA-pq67-6m6q-mj2v","Warn: Project is vulnerable to: PYSEC-2023-192 / GHSA-v845-jxx5-vc9f","Warn: Project is vulnerable to: PYSEC-2020-148 / GHSA-wqvq-5m8c-6g24","Warn: Project is vulnerable to: PYSEC-2021-108"],"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 27 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-20T04:58:49.203Z","repository_id":35004828,"created_at":"2025-08-20T04:58:49.203Z","updated_at":"2025-08-20T04:58:49.203Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31011921,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-27T02:33:22.146Z","status":"ssl_error","status_checked_at":"2026-03-27T02:33:21.763Z","response_time":164,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data","etl","maven","open","pipeline","python"],"created_at":"2025-12-13T22:52:25.222Z","updated_at":"2026-03-27T02:44:51.978Z","avatar_url":"https://github.com/john-sandall.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Maven\n\u003e /meɪvən/ – a trusted expert who seeks to pass timely and relevant knowledge on to others.\n\nMaven's goal is to reduce the time data scientists spend on data cleaning and preparation by providing easy access to open datasets in both raw and processed formats.\n\nMaven was built to:\n\n- **Improve availability and integrity of open data** by eliminating data issues, adding common identifiers, and reshaping data to become model-ready.\n- **Source data in its rawest form** from the most authoritative data provider available with all transformations available as open source code to enhance integrity and trust.\n- **Honour data licences wherever possible** whilst avoiding potential issues relating to re-distribution of data (especially open datasets where no clear licence is provided) by performing all data retrieval and processing on-device.\n\n\n## Install\n```\npip install maven\n```\n\n\n## Usage\n```python\nimport maven\nmaven.get('general-election/UK/2017/results', data_directory='./data/')\n```\n\n\n## Datasets\nData dictionaries for all datasets are available by clicking on the dataset's name.\n\n| Dataset | Description | Date | Source | Licence |\n| -- | -- | -- | -- | -- |\n| **Coronavirus Datasets** |\n| 
[**`coronavirus/CSSE`**](https://github.com/john-sandall/maven/tree/master/maven/datasets/coronavirus) | Daily CSSE cases/deaths/recovered by country/region/state | Updated daily | [Johns Hopkins Center for Systems Science and Engineering](https://github.com/CSSEGISandData/COVID-19/) | [See \"Terms of Use\" on CSSE repo](https://github.com/CSSEGISandData/COVID-19/) |\n| **UK Political Datasets** |\n| [**`general-election/UK/2010/results`**](https://github.com/john-sandall/maven/tree/master/maven/datasets/general_election) | UK 2010 General Election results | 6th May 2010 | [House of Commons Library](https://researchbriefings.parliament.uk/ResearchBriefing/Summary/CBP-8647) | [Open Parliament Licence v3.0](https://www.parliament.uk/site-information/copyright-parliament/open-parliament-licence/) |\n| [**`general-election/UK/2015/results`**](https://github.com/john-sandall/maven/tree/master/maven/datasets/general_election) | UK 2015 General Election results | 7th May 2015 | [House of Commons Library](https://researchbriefings.parliament.uk/ResearchBriefing/Summary/CBP-8647) | [Open Parliament Licence v3.0](https://www.parliament.uk/site-information/copyright-parliament/open-parliament-licence/) |\n| [**`general-election/UK/2017/results`**](https://github.com/john-sandall/maven/tree/master/maven/datasets/general_election) | UK 2017 General Election results | 8th June 2017 | [House of Commons Library](https://researchbriefings.parliament.uk/ResearchBriefing/Summary/CBP-8647) | [Open Parliament Licence v3.0](https://www.parliament.uk/site-information/copyright-parliament/open-parliament-licence/) |\n| [**`general-election/UK/2015/model`**](https://github.com/john-sandall/maven/tree/master/maven/datasets/general_election) | Model-ready datasets for forecasting the 2015 UK General Election | 2010 \u0026 2015 data | [uk_2015_model.py](https://github.com/john-sandall/maven/blob/master/maven/datasets/general_election/uk_2015_model.py) | Mixed |\n| 
[**`general-election/UK/2017/model`**](https://github.com/john-sandall/maven/tree/master/maven/datasets/general_election) | Model-ready datasets for forecasting the 2017 UK General Election | 2015 \u0026 2017 data | [uk_2017_model.py](https://github.com/john-sandall/maven/blob/master/maven/datasets/general_election/uk_2017_model.py) | Mixed |\n| [**`general-election/UK/polls`**](https://github.com/john-sandall/maven/tree/master/maven/datasets/general_election) | UK General Election opinion polling | May 2005 - June 2017 | [SixFifty](https://github.com/six50/pipeline/tree/master/data/polls/) | Unknown |\n\n\n\n## Running tests\nTo run tests against an installed version (either `pip install .` or `pip install maven`):\n```\n$ cd /path/to/repo\n$ pytest\n```\n\nTo run tests whilst in development:\n```\n$ cd /path/to/repo\n$ python -m pytest\n```\n\n\n## Licences\n| Name | Description | Attribution Statement |\n| -- | -- | -- |\n| [Open Parliament Licence](http://www.parliament.uk/site-information/copyright/open-parliament-licence/) | Free to copy, publish, distribute, transmit, adapt and exploit commercially or non-commercially. See URL for full details. | Contains Parliamentary information licensed under the Open Parliament Licence v3.0. |\n| [Open Government Licence](http://www.nationalarchives.gov.uk/doc/open-government-licence/version/2/) | Free to copy, publish, distribute, transmit, adapt and exploit commercially and non-commercially. See URL for full details. | Contains public sector information licensed under the Open Government Licence v2.0. |\n\n\n## Contributing\nMaven was designed for your contributions!\n\n1. Check for open issues or open a fresh issue to start a discussion around your idea or a bug.\n2. Fork [the repository](https://github.com/john-sandall/maven) on GitHub to start making your changes to the master branch (or branch off of it).\n3. For new datasets ensure the processed dataset is fully documented with a data dictionary. 
For new features and bug fixes, please write a test that shows the feature works as expected or that the bug was fixed.\n4. Send a [pull request](https://help.github.com/en/articles/creating-a-pull-request-from-a-fork) and bug the maintainer until it gets merged and published. 😄\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjohn-sandall%2Fmaven","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjohn-sandall%2Fmaven","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjohn-sandall%2Fmaven/lists"}