{"id":15220436,"url":"https://github.com/rhysnewell/aviary","last_synced_at":"2025-07-12T15:34:33.287Z","repository":{"id":40511585,"uuid":"271448699","full_name":"rhysnewell/aviary","owner":"rhysnewell","description":"A hybrid assembly and MAG recovery pipeline (and more!)","archived":false,"fork":false,"pushed_at":"2025-06-25T21:52:20.000Z","size":41852,"stargazers_count":97,"open_issues_count":22,"forks_count":15,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-06-25T22:31:08.771Z","etag":null,"topics":["assembly","binning","bioinformatics","metagenomics","workflow"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rhysnewell.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2020-06-11T04:10:12.000Z","updated_at":"2025-06-12T11:58:59.000Z","dependencies_parsed_at":"2023-12-20T14:16:38.469Z","dependency_job_id":"4bc8f992-dfa7-4d59-aade-341eb98a5fea","html_url":"https://github.com/rhysnewell/aviary","commit_stats":{"total_commits":533,"total_committers":16,"mean_commits":33.3125,"dds":0.5816135084427767,"last_synced_commit":"3c9b501819871e4638863680aff423527a1a7921"},"previous_names":[],"tags_count":30,"template":false,"template_full_name":null,"purl":"pkg:github/rhysnewell/aviary","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rhysnewell%2Faviary","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rhysnewell%2Faviary/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/G
itHub/repositories/rhysnewell%2Faviary/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rhysnewell%2Faviary/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rhysnewell","download_url":"https://codeload.github.com/rhysnewell/aviary/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rhysnewell%2Faviary/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261965928,"owners_count":23237614,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["assembly","binning","bioinformatics","metagenomics","workflow"],"created_at":"2024-09-28T13:09:22.858Z","updated_at":"2025-07-12T15:34:33.200Z","avatar_url":"https://github.com/rhysnewell.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat)](http://bioconda.github.io/recipes/aviary/README.html)\n![](https://anaconda.org/bioconda/aviary/badges/license.svg)\n![](https://anaconda.org/bioconda/aviary/badges/version.svg)\n![](https://anaconda.org/bioconda/aviary/badges/latest_release_relative_date.svg)\n![](https://anaconda.org/bioconda/aviary/badges/platforms.svg)\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.10806928.svg)](https://doi.org/10.5281/zenodo.10806928)\n\n\n![](docs/_include/images/aviary_logo.png)\n\n# Aviary\nAn easy-to-use wrapper for a robust Snakemake pipeline for metagenomic short-read, long-read, and hybrid assembly. 
\nAviary also performs binning, annotation, and strain diversity analyses, and provides users with an easy way to quickly combine and \ndereplicate the results of many Aviary runs. The pipeline currently includes a series of distinct, yet flexible, modules\nthat can seamlessly communicate with each other. The modules can be run independently or together as a single pipeline, depending on the provided input.\n\n[Please refer to the full docs here](https://rhysnewell.github.io/aviary)\n\n# Quick Installation\n\nIdeally, your conda channels should be configured in this order:\n```\nconda config --add channels defaults\nconda config --add channels bioconda\nconda config --add channels conda-forge\n```\n\nYour resulting `.condarc` file should look something like:\n```\nchannels:\n  - conda-forge\n  - bioconda\n  - defaults\n```\n\n#### Option 1: Install from Bioconda\n\nConda can handle the creation of the environment for you directly:\n\n```\nconda create -n aviary -c bioconda aviary\n```\n\nOr install into an existing environment:\n```\nconda install -c bioconda aviary\n```\n\n#### Option 2: Install from pip\n\nCreate the environment using the `aviary.yml` file, then install from pip:\n```\nconda env create -n aviary -f aviary.yml\nconda activate aviary\npip install aviary-genome\n```\n\n#### Option 3: Install from source\n\nInitial requirements for aviary can be installed using the `aviary.yml` file:\n```\ngit clone https://github.com/rhysnewell/aviary.git\ncd aviary\nconda env create -n aviary -f aviary.yml\nconda activate aviary\npip install -e .\n```\nThe `aviary` executable can then be run from any directory. Since the code in\nthis directory is used at run time, any updates made there are\nimmediately available. 
We recommend this mode for developing and debugging\naviary.\n\n## Checking installation\nWhichever option you choose, running `aviary --help` should return the following\noutput:\n\n```\n                    ......:::::: AVIARY ::::::......\n\n           A comprehensive metagenomics bioinformatics pipeline\n\nMetagenome assembly, binning, and annotation:\n        assemble  - Perform hybrid assembly using short and long reads, \n                    or assembly using only short reads\n        recover   - Recover MAGs from provided assembly using a variety \n                    of binning algorithms \n        annotate  - Annotate MAGs using EggNOG and GTBD-tk\n        genotype  - Perform strain diversity analysis of MAGs using Lorikeet\n        complete  - Runs each stage of the pipeline: assemble, recover, \n                    annotate, genotype in that order.\n        cluster   - Combines and dereplicates the MAGs from multiple Aviary runs\n                    using Galah\n\nIsolate assembly, binning, and annotation:\n        isolate   - Perform isolate assembly **PARTIALLY COMPLETED**\n        \nUtility modules:\n        configure - Set or overwrite the environment variables for future runs.\n\n```\n\n## Databases\n\nAviary uses programs which require access to locally stored databases. \nThese databases can be quite large, so we recommend setting up one instance of Aviary and these databases per machine or machine cluster.\n\nThe **required** databases are as follows:\n* [GTDB](https://gtdb.ecogenomic.org/downloads)\n* [EggNog](https://github.com/eggnogdb/eggnog-mapper/wiki/eggNOG-mapper-v2.1.5-to-v2.1.8#setup)\n* [CheckM2](https://github.com/chklovski/CheckM2)\n* [SingleM](https://wwood.github.io/singlem/)\n\n### Installing databases\n\nAviary can handle the download and installation of these databases via the `--download` flag. 
Using `--download`\nwill download and install the databases into the folders corresponding to their associated environment variables. Aviary will\nask you to set these environment variables on its first run if they are not already set. Otherwise, users can use\nthe `aviary configure` subcommand to reset the environment variables:\n\n```commandline\naviary configure -o logs/ --eggnog-db-path /shared/db/eggnog/ --gtdb-path /shared/db/gtdb/ --checkm2-db-path /shared/db/checkm2db/ --singlem-metapackage-path /shared/db/singlem/ --download\n```\n\nThis command checks whether the databases exist at the given locations; if they don't, aviary will download them and change\nthe conda environment variables to match those paths. \n\n**N.B.** Again, these databases are VERY large. Please talk to your sysadmin/bioinformatics specialist about setting up a shared\nlocation for these databases to prevent unnecessary storage use. Additionally, the `--download` flag can be used within\nany aviary module to check that databases are configured properly.\n\n### Environment variables\n\nUpon first running Aviary, you will be prompted to input the location for several database folders if\nthey haven't already been provided. 
If at any point the location of these folders changes, you can\nuse the `aviary configure` module to update the environment variables used by aviary.\n\nThese environment variables can also be configured manually: just set the following variables in your `.bashrc` file:\n```\nexport GTDBTK_DATA_PATH=/path/to/gtdb/gtdb_release220/db/ # https://gtdb.ecogenomic.org/downloads\nexport EGGNOG_DATA_DIR=/path/to/eggnog-mapper/2.1.8/ # https://github.com/eggnogdb/eggnog-mapper/wiki/eggNOG-mapper-v2.1.5-to-v2.1.8#setup\nexport SINGLEM_METAPACKAGE_PATH=/path/to/singlem_metapackage.smpkg/\nexport CHECKM2DB=/path/to/checkm2db/\nexport CONDA_ENV_PATH=/path/to/conda/envs/\n```\n\n# Workflow\n![Aviary workflow](figures/aviary_workflow.png)\n\n\n# Citations\nIf you use aviary, please be aware that you are also using the many other programs that aviary wraps.\nYou should cite all of these tools as well, or at least the ones you know you are using. To make this easy for you,\nwe provide a list of citations in alphabetical order. This list will be updated as new\nmodules are added to aviary.\n\nA constantly updating list of citations can be found in the [Citations document](https://rhysnewell.github.io/aviary/citations).\n\n# License\n\nCode is [GPL-3.0](LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frhysnewell%2Faviary","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frhysnewell%2Faviary","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frhysnewell%2Faviary/lists"}