{"id":40505357,"url":"https://github.com/louiejtaylor/grabseqs","last_synced_at":"2026-01-20T19:33:13.964Z","repository":{"id":41432183,"uuid":"154326448","full_name":"louiejtaylor/grabseqs","owner":"louiejtaylor","description":"A utility for easy downloading of reads from next-gen sequencing repositories like NCBI SRA","archived":false,"fork":false,"pushed_at":"2025-08-18T19:07:09.000Z","size":267,"stargazers_count":107,"open_issues_count":5,"forks_count":14,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-09-28T03:04:49.302Z","etag":null,"topics":["bioinformatics","conda","metagenomics","ncbi-sra","ngs","python","sra"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/louiejtaylor.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2018-10-23T12:44:49.000Z","updated_at":"2025-09-09T09:13:46.000Z","dependencies_parsed_at":"2025-09-08T13:51:10.166Z","dependency_job_id":"f2d8e821-ce59-4afa-bccb-bfa751f1f84b","html_url":"https://github.com/louiejtaylor/grabseqs","commit_stats":null,"previous_names":[],"tags_count":22,"template":false,"template_full_name":null,"purl":"pkg:github/louiejtaylor/grabseqs","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/louiejtaylor%2Fgrabseqs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/louiejtaylor%2Fgrabseqs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/louiejtaylor%2Fgrabseqs/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/louiejtaylor%2Fgrabseqs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/louiejtaylor","download_url":"https://codeload.github.com/louiejtaylor/grabseqs/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/louiejtaylor%2Fgrabseqs/sbom","scorecard":{"id":599621,"data":{"date":"2025-08-11","repo":{"name":"github.com/louiejtaylor/grabseqs","commit":"da50ad06b25b823137275a81d07445412040f9c0"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":4,"checks":[{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Maintained","score":10,"reason":"29 commit(s) and 3 issue activity found in the last 90 days -- score normalized to 10","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"SAST","score":0,"reason":"no SAST tool detected","details":["Warn: no pull requests merged into dev branch"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}},{"name":"Code-Review","score":0,"reason":"Found 0/30 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Pinned-Dependencies","score":0,"reason":"dependency not pinned by hash detected -- score normalized to 0","details":["Warn: downloadThenRun not pinned by hash: .circleci/setup.sh:7","Warn: pipCommand not pinned by hash: tests/run_tests.bash:81","Warn: pipCommand not pinned by hash: tests/run_tests.bash:89","Info:   0 out of   2 pipCommand dependencies pinned","Info:   0 out of   1 downloadThenRun dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: MIT License: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}}]},"last_synced_at":"2025-08-21T00:05:23.683Z","repository_id":41432183,"created_at":"2025-08-21T00:05:23.683Z","updated_at":"2025-08-21T00:05:23.683Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28610642,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-20T18:56:40.769Z","status":"ssl_error","status_checked_at":"2026-01-20T18:54:26.653Z","response_time":117,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","conda","metagenomics","ncbi-sra","ngs","python","sra"],"created_at":"2026-01-20T19:33:13.375Z","updated_at":"2026-01-20T19:33:13.957Z","avatar_url":"https://github.com/louiejtaylor.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# grabseqs\n\nUtility for simplifying bulk downloading data from next-generation sequencing repositories, like [NCBI SRA](https://www.ncbi.nlm.nih.gov/sra/), [MG-RAST](http://www.mg-rast.org/).\n\n[![CircleCI](https://circleci.com/gh/louiejtaylor/grabseqs.svg?style=shield)](https://circleci.com/gh/louiejtaylor/grabseqs) [![Conda version](https://anaconda.org/louiejtaylor/grabseqs/badges/version.svg)](https://anaconda.org/louiejtaylor/grabseqs) [![Conda downloads](https://anaconda.org/louiejtaylor/grabseqs/badges/downloads.svg)](https://anaconda.org/louiejtaylor/grabseqs/files) [![Paper link](https://img.shields.io/badge/Published%20in-Bioinformatics-126888.svg)](https://doi.org/10.1093/bioinformatics/btaa167)\n\n[iMicrobe](https://www.imicrobe.us/) is currently not supported--working to remedy this (2025/08/14)\n\n## Install\n\nInstall grabseqs and all dependencies [via conda](https://conda.io/projects/conda/en/latest/user-guide/getting-started.html):\n\n    conda install grabseqs -c louiejtaylor -c bioconda -c conda-forge\n\nOr with pip (and install the non-Python [dependencies](https://github.com/louiejtaylor/grabseqs#dependencies) yourself):\n\n    pip install grabseqs\n    \n**Note:** If you're using SRA data, after you've installed sra-tools, run `vdb-config -i` and turn off local file caching unless you want extra copies of the downloaded sequences taking up space ([read more here](https://github.com/ncbi/sra-tools/wiki/Toolkit-Configuration)).\n\n## Quick start\n\nDownload all samples from a single SRA Project:\n\n    grabseqs sra SRP#######\n    \nOr any combination of projects (S/ERP), runs (S/ERR), BioProjects (PRJNA):\n\n    grabseqs sra SRR######## ERP####### PRJNA######## ERR########\n\nIf you'd like to do a dry run and just get a list of samples that will be downloaded, pass `-l`:\n    \n    grabseqs sra -l SRP########\n\nSimilar syntax works for MG-RAST:\n\n    grabseqs mgrast mgp##### mgm#######\n\n## Detailed usage\n\nSee the [grabseqs FAQ](https://github.com/louiejtaylor/grabseqs/blob/master/faq/faq.md) for detailed troubleshooting tips.\n\nFun options:\n\n    grabseqs sra -t 10 -m metadata.csv -o proj/ -r 3 SRP#######\n\n(translation: use 10 threads, save metadata to `proj/metadata.csv`, download to the dir `proj/`, retry failed downloads 3x, get all samples from SRP#######)\n    \nIf you'd like to do a dry run and only get a list of samples that will be downloaded, pass `-l`:\n    \n    grabseqs sra -l SRP########\n\nIf you'd like to pass your own arguments to `fasterq-dump` to get data in a slightly different format, you can do so like this:\n\n    grabseqs sra SRP####### -r 0 --custom_fqdump_args=\"--split-spot --progress\"\n\nFull usage:\n\n    grabseqs sra [-h] [-m METADATA] [-o OUTDIR] [-r RETRIES] [-t THREADS]\n                 [-f] [-l] [--no_parsing] [--parse_run_ids]\n                 [--use_fastq_dump]\n                 id [id ...]\n\n    positional arguments:\n      id                One or more BioProject, ERR/SRR or ERP/SRP number(s)\n\n    optional arguments:\n      -h, --help        show this help message and exit\n      -m METADATA       filename in which to save SRA metadata (.csv format,\n                        relative to OUTDIR)\n      -o OUTDIR         directory in which to save output. created if it doesn't\n                        exist\n      -r RETRIES        number of times to retry download\n      -t THREADS        threads to use (for fasterq-dump/pigz)\n      -f                force re-download of files\n      -l                list (but do not download) samples to be grabbed\n      --parse_run_ids   parse SRR/ERR identifers (do not pass straight to fasterq-\n                        dump)\n      --custom_fqdump_args CUSTOM_FQD_ARGS\n                        \"string\" containing args to pass to fastq-dump\n      --use_fastq_dump  use legacy fastq-dump instead of fasterq-dump (no\n                        multithreaded downloading)\n      \nDownloads .fastq.gz files to `OUTDIR` (or the working directory if not specified). If the `-m` flag is passed, saves metadata to `OUTDIR` with filename `METADATA` in csv format.\n\nSimilar options are available for downloading from MG-RAST:\n\n    grabseqs mgrast [-h] [-m METADATA] [-o OUTDIR] [-r RETRIES]\n                    [-t THREADS] [-f] [-l]\n                    rastid [rastid ...]\n\n## Troubleshooting\n\nSee the [grabseqs FAQ](https://github.com/louiejtaylor/grabseqs/blob/master/faq/faq.md) for detailed troubleshooting tips. If the FAQs don't fix your problem, feel free to [open an issue](https://github.com/louiejtaylor/grabseqs/issues)!\n\n## Dependencies\n\n   - Python 3 (external packages req'd: requests, requests-html, pandas, fake-useragent)\n   - sra-tools\u003e3.2\n   - pigz\n   - wget\n\nIf you use conda (on Linux), these will be installed for you!\n\nGrabseqs runs on Mac or Linux. We've tested on these specific OSes:\n\nLinux (conda or pip):\n  - CentOS 6, 7, and 8\n  - Debian 9 and 10\n  - Ubuntu 16.04, 18.04, and 19.10\n  - Red Hat Enterprise 6, 7, and 8\n  - SUSE Enterprise 12 and 15\n\nMac (pip):\n  - MacOS 10.14\n\nGrabseqs has been tested and works with the following version of the Python dependencies (though these are neither minimal nor pinned version numbers):\n   \n   - requests 2.22.0\n   - pandas\u003e2\n\n## Citation\n\nIf you use grabseqs in your work, please cite:\n\nLouis J Taylor, Arwa Abbas, Frederic D Bushman. \"grabseqs: Simple downloading of reads and metadata from multiple next-generation sequencing data repositories.\" *Bioinformatics*, (2020), btaa167, https://doi.org/10.1093/bioinformatics/btaa167\n\nPlease also cite the researchers who generated the data (and the repository, if appropriate)!\n\n------------\n\n## Changelog\n\n**1.0.0** (2025-08-14)\n - Added a walk-through for adding a new repo using `template.py`\n - Better handling for invalid SRA accession numbers\n - Update endpoint for NCBI for SRA downloads\n - Temporarily remove iMicrobe--needs rewrite to use a different tool\n\n**0.7.0** (2020-01-29)\n - Allow users to pass custom args to fast(er)q-dump\n - Minor re-writes of download handling code for easier readability\n\n**0.6.1** (2019-12-20)\n - Validate compressed files (fix #8 and #34)\n \n**0.6.0** (2019-12-12)\n - Gracefully handle incomplete or missing dependencies\n - Major rewrite of test suite\n\n**0.5.2** (2019-12-05)\n - Improvements to work with multiple versions of Python 3\n\n**0.5.1** (2019-11-23)\n - Hotfix handling outdated versions of sra-tools\n\n**0.5.0** (2019-04-11)\n - Metadata available for all sources in .csv format\n\n## History\n\nThis project spawned out of/incorporates code from [hisss](https://github.com/louiejtaylor/hisss); many thanks to [ArwaAbbas](https://github.com/ArwaAbbas) for helping make this work!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flouiejtaylor%2Fgrabseqs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flouiejtaylor%2Fgrabseqs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flouiejtaylor%2Fgrabseqs/lists"}