{"id":13639329,"url":"https://github.com/shenwei356/taxonkit","last_synced_at":"2025-12-29T23:40:48.967Z","repository":{"id":12689324,"uuid":"72552525","full_name":"shenwei356/taxonkit","owner":"shenwei356","description":"A Practical and Efficient NCBI Taxonomy Toolkit, also supports creating NCBI-style taxdump files for custom taxonomies like GTDB/ICTV","archived":false,"fork":false,"pushed_at":"2024-11-08T00:43:22.000Z","size":14982,"stargazers_count":378,"open_issues_count":8,"forks_count":30,"subscribers_count":9,"default_branch":"master","last_synced_at":"2024-11-08T01:35:30.971Z","etag":null,"topics":["bioinformatics","cross-platform","lca","lineage","taxdump","taxid","taxonkit","taxonomy"],"latest_commit_sha":null,"homepage":"https://bioinf.shenwei.me/taxonkit","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/shenwei356.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-11-01T16:02:46.000Z","updated_at":"2024-11-08T00:43:25.000Z","dependencies_parsed_at":"2023-01-13T17:05:13.740Z","dependency_job_id":"463e0c8e-c52e-4d7a-bf75-956fc33e671e","html_url":"https://github.com/shenwei356/taxonkit","commit_stats":{"total_commits":251,"total_committers":3,"mean_commits":83.66666666666667,"dds":0.007968127490039834,"last_synced_commit":"f5505e830313228bf78dec51f95c87545aec1eab"},"previous_names":[],"tags_count":54,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shenwei356%2Ftaxonkit","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/repositories/shenwei356%2Ftaxonkit/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shenwei356%2Ftaxonkit/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shenwei356%2Ftaxonkit/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/shenwei356","download_url":"https://codeload.github.com/shenwei356/taxonkit/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223810367,"owners_count":17206751,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","cross-platform","lca","lineage","taxdump","taxid","taxonkit","taxonomy"],"created_at":"2024-08-02T01:00:59.653Z","updated_at":"2025-12-29T23:40:48.940Z","avatar_url":"https://github.com/shenwei356.png","language":"Go","funding_links":[],"categories":["Science and Data Analysis","Ranked by starred repositories","Data Manipulation and Querying"],"sub_categories":["HTTP Clients"],"readme":"# TaxonKit - A Practical and Efficient NCBI Taxonomy Toolkit\n\n- **Documents:** [https://bioinf.shenwei.me/taxonkit](https://bioinf.shenwei.me/taxonkit)\n([**Usage\u0026Examples**](https://bioinf.shenwei.me/taxonkit/usage/),\n[**Tutorial**](https://bioinf.shenwei.me/taxonkit/tutorial/), [**中文介绍**](https://bioinf.shenwei.me/taxonkit/chinese/))\n- **Source code:** [https://github.com/shenwei356/taxonkit](https://github.com/shenwei356/taxonkit)\n[![GitHub 
stars](https://img.shields.io/github/stars/shenwei356/taxonkit.svg?style=social\u0026label=Star\u0026?maxAge=2592000)](https://github.com/shenwei356/taxonkit)\n[![license](https://img.shields.io/github/license/shenwei356/taxonkit.svg?maxAge=2592000)](https://github.com/shenwei356/taxonkit/blob/master/LICENSE)\n[![Built with GoLang](https://img.shields.io/badge/powered_by-go-6362c2.svg?style=flat)](https://golang.org)\n- **Latest version:** [![Latest Version](https://img.shields.io/github/release/shenwei356/taxonkit.svg?style=flat?maxAge=86400)](https://github.com/shenwei356/taxonkit/releases)\n[![Github Releases](https://img.shields.io/github/downloads/shenwei356/taxonkit/latest/total.svg?maxAge=3600)](https://bioinf.shenwei.me/taxonkit/download/)\n[![Cross-platform](https://img.shields.io/badge/platform-any-ec2eb4.svg?style=flat)](https://bioinf.shenwei.me/taxonkit/download/)\n[![Anaconda Cloud](https://anaconda.org/bioconda/taxonkit/badges/version.svg)](https://anaconda.org/bioconda/taxonkit)\n- **[Please cite](#citation):** [https://doi.org/10.1016/j.jgg.2021.03.006](https://www.sciencedirect.com/science/article/pii/S1673852721000837)\n[![Citation Badge](https://api.juleskreuer.eu/citation-badge.php?doi=10.1016/j.jgg.2021.03.006)](https://scholar.google.com/citations?view_op=view_citation\u0026hl=en\u0026user=wHF3Lm8AAAAJ\u0026citation_for_view=wHF3Lm8AAAAJ:ULOm3_A8WrAC)\n- [pytaxonkit](https://github.com/bioforensics/pytaxonkit), Python bindings for TaxonKit.\n\nRelated projects:\n\n- [**Taxid-Changelog**](https://github.com/shenwei356/taxid-changelog): Tracks all changes of TaxIds, including deletions, additions, merges, reuse, and rank/name changes.\n- [GTDB taxdump](https://github.com/shenwei356/gtdb-taxdump): GTDB taxonomy taxdump files with trackable TaxIds.\n- [ICTV taxdump](https://github.com/shenwei356/ictv-taxdump): NCBI-style taxdump files for the International Committee on Taxonomy of Viruses (ICTV).\n\n## Table of Contents\n\n\u003c!-- START doctoc 
generated TOC please keep comment here to allow auto update --\u003e\n\u003c!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE --\u003e\n\n- [Features](#features)\n- [Subcommands](#subcommands)\n- [Benchmark](#benchmark)\n- [Dataset](#dataset)\n- [Installation](#installation)\n- [Bash-completion](#bash-completion)\n- [Citation](#citation)\n- [Contact](#contact)\n- [License](#license)\n\n\u003c!-- END doctoc generated TOC please keep comment here to allow auto update --\u003e\n\n\n## Features\n\n- **Easy to install** ([download](http://bioinf.shenwei.me/taxonkit/download/))\n    - Statically linked executable binaries for multiple platforms (Linux/Windows/macOS, amd64/arm64) \n    - Lightweight and out-of-the-box: no dependencies, no compilation, no configuration\n    - No database building: just download [NCBI taxonomy data](https://ftp.ncbi.nih.gov/pub/taxonomy/taxdump.tar.gz) and uncompress it to `$HOME/.taxonkit`\n- **Easy to use** ([usages and examples](http://bioinf.shenwei.me/taxonkit/usage/))\n    - Supports [bash-completion](#bash-completion)\n    - Fast (see [benchmark](#benchmark)), with multi-CPU support; most operations take 2-10 s.\n    - Detailed usages and examples\n    - Supports STDIN and (gzipped) input/output files, so it integrates easily into pipelines\n- **Versatile commands** \n    - [Usage and examples](http://bioinf.shenwei.me/taxonkit/usage/)\n    - Featured command: [tracking monthly changelog of all TaxIds](https://github.com/shenwei356/taxid-changelog)\n    - Featured command: [reformatting lineage into the seven-level format (\"superkingdom/kingdom, phylum, class, order, family, genus, species\")](https://bioinf.shenwei.me/taxonkit/usage/#reformat), and [even all possible ranks](https://bioinf.shenwei.me/taxonkit/usage/#reformat2)\n    - Featured command: [filtering TaxIds by a rank range](http://bioinf.shenwei.me/taxonkit/usage/#filter), e.g., at or below genus rank.\n    - Featured command: [**Create NCBI-style taxdump 
files for custom taxonomy, e.g., GTDB and ICTV**](https://bioinf.shenwei.me/taxonkit/usage/#create-taxdump)\n\n## Subcommands\n\nSubcommand                                                                    |Function\n:-----------------------------------------------------------------------------|:----------------------------------------------\n[`list`](https://bioinf.shenwei.me/taxonkit/usage/#list)                      |List taxonomic subtrees (TaxIds) below given TaxIds\n[`lineage`](https://bioinf.shenwei.me/taxonkit/usage/#lineage)                |Query taxonomic lineage of given TaxIds\n[`reformat`](https://bioinf.shenwei.me/taxonkit/usage/#reformat)              |Reformat lineage in canonical ranks\n[`reformat2`](https://bioinf.shenwei.me/taxonkit/usage/#reformat2)\u003csup\u003e*\u003c/sup\u003e|Reformat lineage in chosen ranks, allowing more ranks than 'reformat'\n[`name2taxid`](https://bioinf.shenwei.me/taxonkit/usage/#name2taxid)          |Convert taxon names to TaxIds\n[`filter`](https://bioinf.shenwei.me/taxonkit/usage/#filter)                  |Filter TaxIds by taxonomic rank range\n[`lca`](https://bioinf.shenwei.me/taxonkit/usage/#lca)                        |Compute lowest common ancestor (LCA) for TaxIds\n[`taxid-changelog`](https://bioinf.shenwei.me/taxonkit/usage/#taxid-changelog)|Create TaxId changelog from dump archives\n[`profile2cami`](https://bioinf.shenwei.me/taxonkit/usage/#profile2cami)\u003csup\u003e*\u003c/sup\u003e     |Convert metagenomic profile table to CAMI format \n[`cami-filter`](https://bioinf.shenwei.me/taxonkit/usage/#cami-filter)\u003csup\u003e*\u003c/sup\u003e        |Remove taxa of given TaxIds and their descendants in CAMI metagenomic profile\n[`create-taxdump`](https://bioinf.shenwei.me/taxonkit/usage/#create-taxdump)\u003csup\u003e*\u003c/sup\u003e  |Create NCBI-style taxdump files for custom taxonomy, e.g., GTDB and ICTV\n\nNote: \u003csup\u003e*\u003c/sup\u003eCommands added since the publication.\n\n\u003cimg 
src=\"taxonkit.jpg\" alt=\"taxonkit\" width=\"700\" align=\"center\" /\u003e\n\n## Benchmark\n\n1. Getting complete lineage for given TaxIds (this plot is outdated).\n\n   \u003cimg src=\"bench/bench.get_lineage.reformat.tsv.png\" alt=\"\" width=\"600\" align=\"center\" /\u003e\n\n   Versions: ETE=3.1.2, taxopy=0.5.0 ([faster since 0.6.0](https://github.com/shenwei356/taxonkit/issues/47)), TaxonKit=0.7.2.\n\n## Dataset\n\n1. Download and uncompress `taxdump.tar.gz`: https://ftp.ncbi.nih.gov/pub/taxonomy/taxdump.tar.gz \n2. Copy `names.dmp`, `nodes.dmp`, `delnodes.dmp` and `merged.dmp` to the data directory `$HOME/.taxonkit`,\ne.g., `/home/shenwei/.taxonkit`.\n3. Optionally, copy them to another directory and point TaxonKit to it with the flag `--data-dir`\nor the environment variable `TAXONKIT_DB`.\n\nAll-in-one command:\n\n    wget -c https://ftp.ncbi.nih.gov/pub/taxonomy/taxdump.tar.gz \n    tar -zxvf taxdump.tar.gz\n    \n    mkdir -p $HOME/.taxonkit\n    cp names.dmp nodes.dmp delnodes.dmp merged.dmp $HOME/.taxonkit\n    \n**Update dataset**: Simply re-download the taxdump files, uncompress them, and overwrite the old ones.\n\n## Installation\n\nGo to [Download Page](https://bioinf.shenwei.me/taxonkit/download) for more download options and changelogs.\n\n`TaxonKit` is implemented in the [Go](https://golang.org/) programming language;\n executable binary files **for most popular operating systems** are freely available\n  on the [release](https://github.com/shenwei356/taxonkit/releases) page.\n\n#### Method 1: Download binaries (latest stable/dev version)\n\nJust [download](https://github.com/shenwei356/taxonkit/releases) the compressed\nexecutable file for your operating system\nand uncompress it with the `tar -zxvf *.tar.gz` command or other tools.\nThen:\n\n1. **For Linux-like systems**\n    1. If you have root privileges, simply copy it to `/usr/local/bin`:\n\n            sudo cp taxonkit /usr/local/bin/\n\n    1. 
Or copy to anywhere in the environment variable `PATH`:\n\n            mkdir -p $HOME/bin/; cp taxonkit $HOME/bin/\n\n1. **For Windows**, just copy `taxonkit.exe` to `C:\\WINDOWS\\system32`.\n\n#### Method 2: Install via conda  (latest stable version) [![Install-with-conda](https://anaconda.org/bioconda/taxonkit/badges/installer/conda.svg)](https://bioinf.shenwei.me/taxonkit/download/) [![Anaconda Cloud](https://anaconda.org/bioconda/taxonkit/badges/version.svg)](https://anaconda.org/bioconda/taxonkit) [![downloads](https://anaconda.org/bioconda/taxonkit/badges/downloads.svg)](https://anaconda.org/bioconda/taxonkit)\n\n    conda install -c bioconda taxonkit\n\n#### Method 3: Install via homebrew (out of date)\n\n    brew install brewsci/bio/taxonkit\n    \n#### Method 4: Compile from source (latest stable/dev version)\n\n1. [Install go](https://go.dev/doc/install)\n\n        wget https://go.dev/dl/go1.24.1.linux-amd64.tar.gz\n\n        tar -zxf go1.24.1.linux-amd64.tar.gz -C $HOME/\n\n        # or \n        #   echo \"export PATH=$PATH:$HOME/go/bin\" \u003e\u003e ~/.bashrc\n        #   source ~/.bashrc\n        export PATH=$PATH:$HOME/go/bin\n\n2. 
Compile TaxonKit\n\n        # ------------- the latest stable version -------------\n\n        # Since Go 1.18, `go get` no longer builds or installs executables; use `go install` instead.\n        go install -v github.com/shenwei356/taxonkit/taxonkit@latest\n\n        # The executable binary file is located in:\n        #   ~/go/bin/taxonkit\n        # You can also move it to anywhere in the $PATH\n        mkdir -p $HOME/bin\n        cp ~/go/bin/taxonkit $HOME/bin/\n\n\n        # --------------- the development version --------------\n\n        git clone https://github.com/shenwei356/taxonkit\n        cd taxonkit/taxonkit/\n        go build\n\n        # The executable binary file is located in:\n        #   ./taxonkit\n        # You can also move it to anywhere in the $PATH\n        mkdir -p $HOME/bin\n        cp ./taxonkit $HOME/bin/\n\n\n## Bash-completion\n\nSupported shells: bash|zsh|fish|powershell\n\nBash:\n\n    # generate the completion script\n    taxonkit genautocomplete --shell bash\n\n    # configure the shell if you have not done so before.\n    # install bash-completion if the \"complete\" command is not found.\n    echo \"for bcfile in ~/.bash_completion.d/* ; do source \\$bcfile; done\" \u003e\u003e ~/.bash_completion\n    echo \"source ~/.bash_completion\" \u003e\u003e ~/.bashrc\n\nZsh:\n\n    # generate the completion script\n    taxonkit genautocomplete --shell zsh --file ~/.zfunc/_taxonkit\n\n    # configure the shell if you have not done so before\n    echo 'fpath=( ~/.zfunc \"${fpath[@]}\" )' \u003e\u003e ~/.zshrc\n    echo \"autoload -U compinit; compinit\" \u003e\u003e ~/.zshrc\n\nfish:\n\n    taxonkit genautocomplete --shell fish --file ~/.config/fish/completions/taxonkit.fish\n\n## Citation\n\nIf you use TaxonKit in your work, please cite:\n\n\u003e Shen, W., Ren, H., TaxonKit: a practical and efficient NCBI Taxonomy toolkit,\n\u003e Journal of Genetics and Genomics, [https://doi.org/10.1016/j.jgg.2021.03.006](https://www.sciencedirect.com/science/article/pii/S1673852721000837) [![Citation 
Badge](https://api.juleskreuer.eu/citation-badge.php?doi=10.1016/j.jgg.2021.03.006)](https://scholar.google.com/citations?view_op=view_citation\u0026hl=en\u0026user=wHF3Lm8AAAAJ\u0026citation_for_view=wHF3Lm8AAAAJ:ULOm3_A8WrAC)\n\n## Contact\n\n[Create an issue](https://github.com/shenwei356/taxonkit/issues) to report bugs,\npropose new functions or ask for help.\n\n## License\n\n[MIT License](https://github.com/shenwei356/taxonkit/blob/master/LICENSE)\n\n## Starchart\n\n\u003cimg src=\"https://starchart.cc/shenwei356/taxonkit.svg\" alt=\"Stargazers over time\" style=\"max-width: 100%\"\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshenwei356%2Ftaxonkit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fshenwei356%2Ftaxonkit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshenwei356%2Ftaxonkit/lists"}