{"id":46345853,"url":"https://github.com/tomarovsky/buscoclade","last_synced_at":"2026-03-08T09:05:38.411Z","repository":{"id":165229658,"uuid":"565888539","full_name":"tomarovsky/BuscoClade","owner":"tomarovsky","description":"Snakemake pipeline to construct species phylogenies using BUSCOs","archived":false,"fork":false,"pushed_at":"2025-11-30T16:31:01.000Z","size":474,"stargazers_count":4,"open_issues_count":2,"forks_count":2,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-12-02T21:23:54.163Z","etag":null,"topics":["busco","phylogenomics","sciworkflows","snakemake"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tomarovsky.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2022-11-14T14:38:57.000Z","updated_at":"2025-11-30T16:21:49.000Z","dependencies_parsed_at":null,"dependency_job_id":"b13b8cca-910d-4b36-b9cd-65672511d4f4","html_url":"https://github.com/tomarovsky/BuscoClade","commit_stats":{"total_commits":86,"total_committers":1,"mean_commits":86.0,"dds":0.0,"last_synced_commit":"a72b6e162604e5f7f2ffeefe7dec64e5763453f6"},"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"purl":"pkg:github/tomarovsky/BuscoClade","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tomarovsky%2FBuscoClade","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tomarovsky%2FBuscoClade/tags","release
s_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tomarovsky%2FBuscoClade/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tomarovsky%2FBuscoClade/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tomarovsky","download_url":"https://codeload.github.com/tomarovsky/BuscoClade/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tomarovsky%2FBuscoClade/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30094003,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-04T20:42:30.420Z","status":"ssl_error","status_checked_at":"2026-03-04T20:42:30.057Z","response_time":59,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["busco","phylogenomics","sciworkflows","snakemake"],"created_at":"2026-03-04T21:34:49.194Z","updated_at":"2026-03-08T09:05:38.406Z","avatar_url":"https://github.com/tomarovsky.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Snakemake workflow: BuscoClade\n\n[![Snakemake](https://img.shields.io/badge/snakemake==9.16-brightgreen.svg)](https://snakemake.github.io)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\n## Description\n\nPipeline to construct 
species phylogenies using [BUSCO](https://busco.ezlab.org/).\n\n![Workflow scheme](./workflow.png)\n\n- Alignment: [PRANK](http://wasabiapp.org/software/prank/), [MAFFT](https://mafft.cbrc.jp/alignment/software/).\n- Trimming: [GBlocks](https://academic.oup.com/mbe/article/17/4/540/1127654), [TrimAl](http://trimal.cgenomics.org/).\n- Phylogenetic tree construction: [IQTree](http://www.iqtree.org/), [MrBayes](https://nbisweden.github.io/MrBayes/), [ASTRAL-IV](https://doi.org/10.1093/molbev/msaf172), [RapidNJ](https://birc.au.dk/software/rapidnj), [PHYLIP](https://phylipweb.github.io/phylip/), [RAxML-NG](https://github.com/amkozlov/raxml-ng).\n- Visualization: [Etetoolkit](http://etetoolkit.org/), [Matplotlib](https://matplotlib.org/stable/).\n\n## Usage\n\n### Step 1. Deploy workflow\n\nTo use this workflow, you can either download and extract the [latest release](https://github.com/tomarovsky/BuscoClade/releases) or clone the repository:\n\n```\ngit clone https://github.com/tomarovsky/BuscoClade.git\n```\n\n### Step 2. Add species genomes\n\nPlace your FASTA genome assemblies into the `genomes/` directory. Keep in mind that the file prefixes will influence the output phylogeny. The pipeline supports FASTA files with the extensions `.fasta`, `.fna`, and `.fa`, including their gzipped versions (e.g., `.fasta.gz`, `.fna.gz`, `.fa.gz`).\n\n### Step 3. Configure workflow\n\nTo set up the workflow, modify `config/default.yaml`. I recommend copying the config file and making all modifications in the copy. Some options (all non-nested options from `default.yaml`) can also be set on the command line using the `--config` flag. Sections of the config file:\n\n- **Pipeline Configuration:**\nThis section outlines the workflow. By default, it includes alignment and subsequent filtering of nucleotide sequences, and all tools for phylogeny reconstruction except MrBayes (it is recommended to run the GPU-compiled version separately). 
To disable a tool, set its value to `False` or comment out the corresponding line.\n\n- **Tool Parameters:**\nSpecify parameters for each tool. To perform BUSCO, it is important to specify:\n  - `busco_dataset_path`: Download the BUSCO dataset beforehand and specify its path here.\n  - `busco_params`: Use the `--offline` flag and the `--download_path` parameter, indicating the path to the `busco_downloads/` directory.\n\n- **Directory structure:**\nDefine output file structure in the `results/` directory. It is recommended to leave it unchanged.\n\n- **Resources:**\nSpecify Slurm queue, threads, memory, and runtime for each tool.\n\n### Step 4. Execute workflow\n\nInstall snakemake:\n\n```\nmamba create -c conda-forge -c bioconda -c nodefaults -n snakemake snakemake snakemake-executor-plugin-cluster-generic\nmamba activate snakemake\n```\n\nFor a dry run:\n\n```\nsnakemake --profile profile/slurm/ --configfile config/default.yaml --dry-run\n```\n\nSnakemake will print all the rules that will be executed. Remove `--dry-run` to initiate the actual run.\n\n**How to run the workflow if I have completed BUSCOs?**\n\nFirst, move the genome assemblies to the `genomes/` directory or create empty files with corresponding names. Then, create a `results/busco/` directory and move the BUSCO output directories into it. Note that BUSCO output must be formatted. 
Thus, for `Ailurus_fulgens.fasta` BUSCO output should look like this:\n\n```\nresults/\n    busco/\n        Ailurus_fulgens/\n            busco_sequences/\n                fragmented_busco_sequences/\n                multi_copy_busco_sequences/\n                single_copy_busco_sequences/\n            hmmer_output/\n            logs/\n            metaeuk_output/\n            full_table_Ailurus_fulgens.tsv\n            missing_busco_list_Ailurus_fulgens.tsv\n            short_summary_Ailurus_fulgens.txt\n            short_summary.json\n            short_summary.specific.mammalia_odb10.Ailurus_fulgens.json\n            short_summary.specific.mammalia_odb10.Ailurus_fulgens.txt\n```\n\n## Contact\n\nPlease email me at: \u003candrey.tomarovsky@gmail.com\u003e for any questions or feedback.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftomarovsky%2Fbuscoclade","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftomarovsky%2Fbuscoclade","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftomarovsky%2Fbuscoclade/lists"}