{"id":48232276,"url":"https://github.com/bactopia/bactopia-msystems-2020","last_synced_at":"2026-04-04T19:42:24.448Z","repository":{"id":110398092,"uuid":"433564151","full_name":"bactopia/bactopia-msystems-2020","owner":"bactopia","description":"Scripts and data associated with Bactopia publication in mSystems","archived":false,"fork":false,"pushed_at":"2021-11-30T19:46:40.000Z","size":18589,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-09-09T13:39:40.269Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bactopia.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2021-11-30T19:45:11.000Z","updated_at":"2021-11-30T19:46:45.000Z","dependencies_parsed_at":"2023-03-08T21:15:33.299Z","dependency_job_id":null,"html_url":"https://github.com/bactopia/bactopia-msystems-2020","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/bactopia/bactopia-msystems-2020","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bactopia%2Fbactopia-msystems-2020","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bactopia%2Fbactopia-msystems-2020/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bactopia%2Fbactopia-msystems-2020/releases","manifests_url":"https://r
epos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bactopia%2Fbactopia-msystems-2020/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bactopia","download_url":"https://codeload.github.com/bactopia/bactopia-msystems-2020/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bactopia%2Fbactopia-msystems-2020/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31411350,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-04T19:29:44.979Z","status":"ssl_error","status_checked_at":"2026-04-04T19:29:11.535Z","response_time":60,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-04-04T19:42:23.251Z","updated_at":"2026-04-04T19:42:24.430Z","avatar_url":"https://github.com/bactopia.png","language":"HTML","funding_links":[],"categories":[],"sub_categories":[],"readme":"\nBelow is a description of the files in this directory and subdirectories.\n\n# `data` Folder\n```\n└── data\n    ├── bactopia-analysis.html\n    ├── fastani\n    │   └── crispatus-include.txt\n    ├── gtdb\n    │   ├── exclude.txt\n    │   ├── gtdbtk.filtered.tsv\n    │   └── gtdbtk.summary.tsv\n    ├── lactobacillus-accessions.txt\n    ├── lactobacillus-results.txt\n    ├── lactobacillus-summary.txt\n    ├── phyloflash\n    │   ├── 
phyloflash-alignment.fasta.gz\n    │   ├── phyloflash-contree.txt\n    │   ├── phyloflash-iqtree.txt\n    │   ├── phyloflash-merged.fasta.gz\n    │   └── phyloflash-summary.txt\n    ├── roary\n    │   ├── core-genome.aligned.fa.gz\n    │   ├── core-genome.contree\n    │   ├── core-genome.distance.txt\n    │   └── core-genome.iqtree\n    └── summary\n        ├── amrfinder\n        │   ├── amrfinder-gene-detailed-summary.txt\n        │   ├── amrfinder-gene-summary.txt\n        │   ├── amrfinder-protein-detailed-summary.txt\n        │   └── amrfinder-protein-summary.txt\n        ├── ariba\n        │   ├── ariba-card-detailed-summary.txt\n        │   ├── ariba-card-summary.txt\n        │   ├── ariba-vfdb_core-detailed-summary.txt\n        │   └── ariba-vfdb_core-summary.txt\n        ├── lactobacillus-exclude.txt\n        ├── lactobacillus-report.txt\n        └── lactobacillus-summary.txt\n```\nThis directory contains the files used to create the final results and phylogenies.\n\n| Filename | Description |\n|----------|-------------|\n| bactopia-analysis.html | HTML output created from the R Markdown script bactopia-analysis.Rmd in the scripts directory |\n\n### Directories\n#### `fastani`\n\n| Filename | Description |\n|----------|-------------|\n| crispatus-include.txt| List of genomes with \u003e 95% ANI to *Lactobacillus crispatus* |\n\n#### `gtdb`\n\n| Filename | Description |\n|----------|-------------|\n| exclude.txt | Genomes not classified as Lactobacillus |\n| gtdbtk.filtered.tsv| List of genomes with an insufficient number of amino acids in MSA |\n| gtdbtk.summary.tsv| A summary of classifications provided by GTDB-Tk, see classification summary for more details |\n\n#### `phyloflash`\n\n| Filename | Description |\n|----------|-------------|\n| phyloflash-alignment.fasta.gz | The multiple sequence alignment of 16S genes |\n| phyloflash-contree.txt| Consensus tree with assigned branch supports created from 16S alignments  |\n| phyloflash-iqtree.txt| Full result 
of the IQTree run, this is the main report file |\n| phyloflash-merged.fasta.gz| All 16S genes used in the multiple sequence alignment |\n| phyloflash-summary.txt| The aggregated phyloFlash results of all samples |\n\n\n#### `roary`\nThe results of the *Lactobacillus crispatus* pan-genome\n\n| Filename | Description |\n|----------|-------------|\n| core-genome.aligned.fa.gz| The multiple sequence alignment of core genes |\n| core-genome.contree | Consensus tree with assigned branch supports created from the core genome alignment |\n| core-genome.distance.txt | Pairwise core genome SNP distance matrix |\n| core-genome.iqtree | Full result of the IQTree run, this is the main report file |\n\n#### `summary`\n\n| Filename | Description |\n|----------|-------------|\n| {amrfinder\\|ariba}-{gene\\|protein\\|card\\|vfdb}-detailed-summary.txt | Detailed information about each hit against a specific antimicrobial resistance or Ariba dataset |\n| {amrfinder\\|ariba}-{gene\\|protein\\|card\\|vfdb}-summary.txt | A presence/absence matrix for hits against a specific antimicrobial resistance or Ariba dataset  |\n| lactobacillus-exclude.txt | A list of samples and the reason they failed quality cutoffs |\n| lactobacillus-report.txt| A tab-delimited file containing sequence, assembly and annotation stats for all samples|\n| lactobacillus-summary.txt| Brief breakdown of ranks and qc-failures |\n\n\n# `figures\\files\\tables` Folders\n```\n\n└── figures\n    ├── figure-1a-bactopia-overview.png\n    ├── figure-1b-bactopia-workflow.pdf\n    ├── figure-1b-bactopia-workflow.png\n    ├── figure-1b-bactopia-workflow.svg\n    ├── figure-2a-lactobacillus-16s.png\n    ├── figure-2a-lactobacillus-16s.svg\n    ├── figure-2b-lactobacillus-only-16s-annotated.png\n    ├── figure-2b-lactobacillus-only-16s-annotated.svg\n    ├── figure-2b-lactobacillus-only-16s.svg\n    ├── figure-3-lcrispatus-core-genome-annotated.png\n    ├── figure-3-lcrispatus-core-genome-annotated.svg\n    ├── 
figure-3-lcrispatus-core-genome.svg\n    ├── supplementary-figure-1-bactopia-workflow.pdf\n    ├── supplementary-figure-1-bactopia-workflow.png\n    ├── supplementary-figure-1-bactopia-workflow.svg\n    ├── supplementary-figure-2-quality-by-year.pdf\n    ├── supplementary-figure-2-quality-by-year.png\n    ├── supplementary-figure-3-genome-size-assembly-vs-estimate.pdf\n    ├── supplementary-figure-3-genome-size-assembly-vs-estimate.png\n    ├── supplementary-figure-4-consistent-genome-size.pdf\n    └── supplementary-figure-4-consistent-genome-size.png\n└──  files\n    ├── supplementary-data-1-lactobacillus-query-results.txt\n    ├── supplementary-data-2-illumina-accessions.txt\n    └── supplementary-data-3-nextflow-report.html\n└── tables\n    ├── supplementary-table-1-samples-excluded.txt\n    ├── supplementary-table-2-non-lactobacillus-by-gtdb.txt\n    ├── table-1-list-of-bioinformatic-tools.txt\n    ├── table-2-comparison-of-workflows.txt\n    ├── table-3-lactobacillus-sequence-summary.txt\n    └── table-4-lactobacillus-crispatus-metadata.txt\n```\n\n| Filename / Directory | Description |\n|----------|-------------|\n| figures | Figures used in preprint |\n| files | Supplementary data in the preprint |\n| tables | Tab-delimited representations of tables in the preprint |\n\n# `scripts` Folder\n```\n└── scripts\n    ├── bactopia-analysis.Rmd\n    ├── bactopia-workflow-key.R\n    ├── bactopia-workflow.R\n    └── lactobacillus-analysis.sh\n```\n\nThis directory contains the scripts used in this analysis.\n\n| Filename | Description |\n|----------|-------------|\n| bactopia-analysis.Rmd | This is the primary script for analysis results, used to create bactopia-analysis.html |\n| bactopia-workflow-key.R | Used to create the key for the Bactopia Workflow diagram |\n| bactopia-workflow.R | Used to create the Bactopia Workflow diagram |\n| lactobacillus-analysis.sh | Commands used to run Bactopia 
|\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbactopia%2Fbactopia-msystems-2020","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbactopia%2Fbactopia-msystems-2020","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbactopia%2Fbactopia-msystems-2020/lists"}