{"id":28512476,"url":"https://github.com/ncbi/datasets","last_synced_at":"2026-03-03T21:07:54.720Z","repository":{"id":37789772,"uuid":"253874799","full_name":"ncbi/datasets","owner":"ncbi","description":"NCBI Datasets is a new resource that lets you easily gather data from across NCBI databases.","archived":false,"fork":false,"pushed_at":"2026-02-20T14:06:54.000Z","size":22247,"stargazers_count":508,"open_issues_count":22,"forks_count":62,"subscribers_count":28,"default_branch":"master","last_synced_at":"2026-02-20T18:10:57.697Z","etag":null,"topics":["genomics-data","ncbi"],"latest_commit_sha":null,"homepage":"https://www.ncbi.nlm.nih.gov/datasets","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ncbi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2020-04-07T18:09:28.000Z","updated_at":"2026-02-20T12:58:37.000Z","dependencies_parsed_at":"2023-02-19T12:45:22.867Z","dependency_job_id":"aedbdf76-284c-4d36-8654-61a4fc395018","html_url":"https://github.com/ncbi/datasets","commit_stats":{"total_commits":295,"total_committers":16,"mean_commits":18.4375,"dds":0.7491525423728813,"last_synced_commit":"95ac7295736d26df77a86c29c03e84624275dea1"},"previous_names":[],"tags_count":222,"template":false,"template_full_name":null,"purl":"pkg:github/ncbi/datasets","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ncbi%2Fdatasets","tags_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/repositories/ncbi%2Fdatasets/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ncbi%2Fdatasets/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ncbi%2Fdatasets/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ncbi","download_url":"https://codeload.github.com/ncbi/datasets/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ncbi%2Fdatasets/sbom","scorecard":{"id":414548,"data":{"date":"2025-08-11","repo":{"name":"github.com/ncbi/datasets","commit":"8c70318899603ea4f052244426772dd6c60e7e8a"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3.8,"checks":[{"name":"Maintained","score":10,"reason":"16 commit(s) and 23 issue activity found in the last 90 days -- score normalized to 10","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Code-Review","score":0,"reason":"Found 0/15 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines 
if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build 
process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":9,"reason":"license file detected","details":["Info: project has a license file: LICENSE.md:0","Warn: project license file does not contain an FSF or OSI license."],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":0,"reason":"Project has not signed or included provenance with any releases.","details":["Warn: release artifact v18.5.2 not signed: https://api.github.com/repos/ncbi/datasets/releases/240081818","Warn: release artifact v18.5.1 not signed: https://api.github.com/repos/ncbi/datasets/releases/236889183","Warn: release artifact v18.5.0 not signed: https://api.github.com/repos/ncbi/datasets/releases/234254029","Warn: release artifact v18.4.1 not signed: https://api.github.com/repos/ncbi/datasets/releases/231253488","Warn: release artifact v18.4.0 not signed: https://api.github.com/repos/ncbi/datasets/releases/229775647","Warn: release artifact v18.5.2 does not have provenance: https://api.github.com/repos/ncbi/datasets/releases/240081818","Warn: release artifact v18.5.1 does not have provenance: https://api.github.com/repos/ncbi/datasets/releases/236889183","Warn: release artifact v18.5.0 does not have provenance: https://api.github.com/repos/ncbi/datasets/releases/234254029","Warn: release artifact v18.4.1 does not have provenance: https://api.github.com/repos/ncbi/datasets/releases/231253488","Warn: release artifact v18.4.0 does not have 
provenance: https://api.github.com/repos/ncbi/datasets/releases/229775647"],"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 30 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-18T23:29:22.440Z","repository_id":37789772,"created_at":"2025-08-18T23:29:22.440Z","updated_at":"2025-08-18T23:29:22.440Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30060982,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-03T18:21:05.932Z","status":"ssl_error","status_checked_at":"2026-03-03T18:20:59.341Z","response_time":61,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof 
while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["genomics-data","ncbi"],"created_at":"2025-06-09T00:38:07.430Z","updated_at":"2026-03-03T21:07:54.703Z","avatar_url":"https://github.com/ncbi.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# NCBI Datasets\n\nNCBI Datasets is a resource that lets you easily gather data from across NCBI databases. You can use it to find and download sequence, annotation, and metadata for genes and genomes using our command-line interface (CLI) tools or [NCBI Datasets](https://www.ncbi.nlm.nih.gov/datasets/) web interface.\n\nNCBI Datasets tools are under active development. 
To submit feedback, please create a [GitHub issue](https://github.com/ncbi/datasets/issues/new/choose) or [contact NCBI](mailto:info@ncbi.nlm.nih.gov) directly with your questions, comments or feature requests.\n\n## Install the NCBI Datasets command-line tools\n\n[![Anaconda.org badge](https://anaconda.org/conda-forge/ncbi-datasets-cli/badges/version.svg)](https://anaconda.org/conda-forge/ncbi-datasets-cli)\n[![Platforms badge](https://anaconda.org/conda-forge/ncbi-datasets-cli/badges/platforms.svg)](https://anaconda.org/conda-forge/ncbi-datasets-cli)\n[![Total downloads badge](https://anaconda.org/conda-forge/ncbi-datasets-cli/badges/downloads.svg)](https://anaconda.org/conda-forge/ncbi-datasets-cli)\n\nInstall the latest version (CLI v16.x) of the NCBI Datasets CLI tools, *datasets* and *dataformat*, using conda:\n\n`conda install -c conda-forge ncbi-datasets-cli`\n\nFor other installation options, see our CLI tools [download and install](https://www.ncbi.nlm.nih.gov/datasets/docs/download-and-install/) instructions.\n\n## Use the NCBI Datasets command-line tools\n\nUse *datasets* to download biological sequence data across all domains of life from NCBI.\n\nUse *dataformat* to convert metadata included as part of the data package from JSON Lines format to other formats.\n\n### Examples:\nUse *datasets* to download a genome data package for the human reference genome GRCh38:\n\n`datasets download genome taxon human --reference --filename human-reference.zip`\n\nUse *dataformat* to extract selected fields of metadata from the downloaded data package for the human reference genome, GRCh38:\n```\ndataformat tsv genome --package human-reference.zip --fields organism-name,assminfo-name,accession,assminfo-submitter\nOrganism name\tAssembly Name\tAssembly Accession\tAssembly Submitter\nHomo sapiens\tGRCh38.p14\tGCF_000001405.40\tGenome Reference Consortium\n```\n\nThe Datasets CLI schematic below also outlines the available commands for the *datasets* CLI.\n![Datasets 
CLI schematic](https://www.ncbi.nlm.nih.gov/datasets/docs/v2/datasets_schema_taxonomy.png)\n\n### Download large numbers of genomes\n\nDownload large numbers of genomes by first downloading a dehydrated zip archive and then accessing the data in three steps.\n\n1. Download the dehydrated zip archive\n1. Unzip the downloaded zip archive\n1. Rehydrate to access the data\n\n\nTry this example for the human reference genome:\n\n1. Download the dehydrated zip archive:\n`datasets download genome accession GCF_000001405.40 --dehydrated --filename human_GRCh38_dataset.zip`\n\n1. Unzip the downloaded zip archive:\n`unzip human_GRCh38_dataset.zip -d my_human_dataset`\n\n1. Rehydrate to access the data:\n`datasets rehydrate --directory my_human_dataset/`\n\nFor more information, see [how to download large genome data packages](https://www.ncbi.nlm.nih.gov/datasets/docs/how-tos/genomes/large-download/).\n\n### Use your API key with the NCBI Datasets command-line tools\nNCBI Datasets API and command-line tool requests are rate-limited. By default, this rate limit is set at 5 requests per second (rps). By using your API key, you can increase this rate limit to 10 rps. For more information, see our documentation on [how to get an API key](https://www.ncbi.nlm.nih.gov/datasets/docs/v2/api/api-keys/#get-your-api-key) and [how to use your API key.](https://www.ncbi.nlm.nih.gov/datasets/docs/v2/api/api-keys/#use-your-api-key-with-the-ncbi-datasets-command-line-tools)\n\n## NCBI Datasets data packages\nNCBI Datasets provides sequence, annotation, metadata and other biological data as [NCBI Datasets Data Package zip archives](https://www.ncbi.nlm.nih.gov/datasets/docs/v2/reference-docs/data-packages/).\n\nWe currently offer four types of data package:\n1. An [NCBI Datasets Gene Data Package](https://www.ncbi.nlm.nih.gov/datasets/docs/v2/reference-docs/data-packages/gene-package/)\n1. 
An [NCBI Datasets Genome Data Package](https://www.ncbi.nlm.nih.gov/datasets/docs/v2/reference-docs/data-packages/genome/)\n1. A specialized [NCBI Datasets Virus Data Package](https://www.ncbi.nlm.nih.gov/datasets/docs/v2/reference-docs/data-packages/virus-genome/)\n1. An [NCBI Datasets Taxonomy Data Package](https://www.ncbi.nlm.nih.gov/datasets/docs/v2/reference-docs/data-packages/taxonomy/)\n\n## NCBI Datasets data reports\nNCBI Datasets data packages include data report files that contain metadata about the requested records. [Data report schemas](https://www.ncbi.nlm.nih.gov/datasets/docs/reference-docs/data-reports/) describe each type of data report, including available fields, with descriptions and examples.\n\n## Citing NCBI Datasets\n### Exploring and retrieving sequence and metadata for species across the tree of life with NCBI Datasets\n\nO'Leary NA, Cox E, Holmes JB, Anderson WR, Falk R, Hem V, Tsuchiya MTN, Schuler GD, Zhang X, Torcivia J, Ketter A, Breen L, Cothran J, Bajwa H, Tinne J, Meric PA, Hlavina W, Schneider VA. [Exploring and retrieving sequence and metadata for species across the tree of life with NCBI Datasets.](https://www.nature.com/articles/s41597-024-03571-y) Sci Data. 2024 Jul 5;11(1):732. doi: 10.1038/s41597-024-03571-y. PMID: 38969627; PMCID: PMC11226681.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fncbi%2Fdatasets","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fncbi%2Fdatasets","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fncbi%2Fdatasets/lists"}