{"id":18598233,"url":"https://github.com/cmdoret/acastellanii_genome_annotation","last_synced_at":"2025-05-16T14:11:16.365Z","repository":{"id":46680945,"uuid":"206617319","full_name":"cmdoret/Acastellanii_genome_annotation","owner":"cmdoret","description":"Reproducible container-based pipeline for the annotation of the Acanthamoeba castellanii genome","archived":false,"fork":false,"pushed_at":"2023-02-08T15:34:46.000Z","size":59,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-02-17T23:47:42.537Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cmdoret.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-09-05T17:11:24.000Z","updated_at":"2023-04-01T10:28:53.000Z","dependencies_parsed_at":"2024-11-07T01:40:50.150Z","dependency_job_id":"045e9b87-e49f-4f25-bca9-650ad88d5ed7","html_url":"https://github.com/cmdoret/Acastellanii_genome_annotation","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cmdoret%2FAcastellanii_genome_annotation","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cmdoret%2FAcastellanii_genome_annotation/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cmdoret%2FAcastellanii_genome_annotation/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposit
ories/cmdoret%2FAcastellanii_genome_annotation/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cmdoret","download_url":"https://codeload.github.com/cmdoret/Acastellanii_genome_annotation/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254544158,"owners_count":22088808,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-07T01:31:45.183Z","updated_at":"2025-05-16T14:11:16.349Z","avatar_url":"https://github.com/cmdoret.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Acanthamoeba castellanii genome annotation\n*cmdoret, 20190905*\n\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.5541742.svg)](https://doi.org/10.5281/zenodo.5541742)\n\nThis pipeline reproduces the automatic annotation procedure used for the A. castellanii genome assembly. Most of the work is done using [funannotate](https://github.com/nextgenusfs/funannotate). Each step of the pipeline is run inside a Singularity container. The only dependencies are python\u003e=3.7, funannotate, snakemake and conda.\n\nThe associated Zenodo record holds a frozen copy of the repository as well as the input and output data. 
The input data (genome assemblies used in the publication) are automatically downloaded from Zenodo upon execution of the pipeline.\n\n### Configuration\n\nThere are 3 configuration files:\n  * `config.yaml`: General pipeline parameters as well as paths to the output and temporary folders.\n  * `samples.tsv`: Assemblies to annotate.\n  * `units.tsv`: Describes the input read files to use as evidence during annotation.\n\n### Installation\n\nFunannotate must be installed in the current conda environment and set up according to the official instructions (including setting the $FUNANNOTATE_DB variable). Snakemake \u003e=5.5 is also required to run the pipeline. \n\n[Eggnog-mapper](https://github.com/eggnogdb/eggnog-mapper/wiki/eggNOG-mapper-v2#Installation) and [InterProScan](https://github.com/ebi-pf-team/interproscan/wiki/HowToDownload) should also be available, as they will be called by funannotate to improve the annotations.\n\n### Usage\n\nIf funannotate is available in your current conda environment, you can run the pipeline with the following command:\n```bash\nsnakemake --use-conda -j4\n```\n\n### Description\n\nThe pipeline works as follows:\n  0. Download the assemblies from Zenodo and RNA-seq reads from SRA\n  1. Clean the input assembly (rename headers, sort and filter scaffolds)\n  2. Soft mask repeats from the assembly\n  3. Use RNA-seq data to predict genes with AUGUSTUS\n  4. Use remote services for functional annotations (Eggnog-mapper, InterProScan, phobius) \n  5. Combine functional annotations from the different sources.\n\n![Pipeline steps](docs/filegraph.svg)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcmdoret%2Facastellanii_genome_annotation","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcmdoret%2Facastellanii_genome_annotation","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcmdoret%2Facastellanii_genome_annotation/lists"}