https://github.com/rhysnewell/gtdb_genome_filter
Small script to parse and filter out genomes in the GTDB
- Host: GitHub
- URL: https://github.com/rhysnewell/gtdb_genome_filter
- Owner: rhysnewell
- License: gpl-3.0
- Created: 2021-01-21T02:57:33.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2021-01-21T04:00:45.000Z (almost 4 years ago)
- Last Synced: 2024-10-12T06:22:35.307Z (about 1 month ago)
- Language: Python
- Size: 20.5 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# gtdb_genome_filter
Small script to parse and filter out genomes in the GTDB

# Installation
```
git clone https://github.com/rhysnewell/gtdb_genome_filter.git
cd gtdb_genome_filter
pip install --editable .
gtdb_filter filter --help
```

# Requirements
Initial requirements for aviary can be installed into a conda environment using `aviary.yml`:
```
conda env create -n aviary -f aviary.yml
```

# Usage
To perform MAG recovery:
```
aviary recover --assembly scaffolds.fasta --short_reads_1 sr1.1.fq sr2.1.fq.gz --short_reads_2 sr1.2.fq sr2.2.fq.gz --longreads nanopore.fastq.gz --output output_dir/ --max_threads 24 --n_cores 24 --gtdb_path /path/to/gtdb/release/
```

# Batch Files
Instead of providing aviary with an assembly and reads, you can provide it a batch file in the following format:
```
/Absolute/Path/to/Assembly1.fasta Unique_ID_1 /absolute/path/to/read_set_1.1.fq.gz /absolute/path/to/read_set_1.2.fq.gz /absolute/path/to/read_set_2.1.fq.gz /absolute/path/to/read_set_2.2.fq.gz
/Absolute/Path/to/Assembly2.fasta Unique_ID_2 /absolute/path/to/read_set_3.1.fq.gz /absolute/path/to/read_set_3.2.fq.gz /absolute/path/to/read_set_4.1.fq.gz /absolute/path/to/read_set_4.2.fq.gz
```

Then specify the path to your batch file in `config.yaml`, ignoring the inputs for the fasta and any reads, and run the following command:
`snakemake --use-conda --cores 24 run_batch`
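
Before launching a batch run, it can help to sanity-check the batch file. The sketch below is not part of this repository: `BatchEntry` and `parse_batch_lines` are hypothetical names. It assumes fields are whitespace-separated, that paths contain no spaces, and that read files after the unique ID come in forward/reverse pairs, as the `.1`/`.2` naming in the example suggests.

```python
from typing import NamedTuple


class BatchEntry(NamedTuple):
    assembly: str      # path to the assembly fasta
    sample_id: str     # unique ID for this assembly
    read_pairs: list   # list of (forward, reverse) read path tuples


def parse_batch_lines(lines):
    """Parse batch-file lines into BatchEntry records.

    Assumed layout (not confirmed by the tool itself): assembly path,
    unique ID, then read files listed as forward/reverse pairs, all
    separated by whitespace.
    """
    entries = []
    seen_ids = set()
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # Need assembly + ID + at least one read pair, and an even field count.
        if len(fields) < 4 or len(fields) % 2 != 0:
            raise ValueError(
                f"line {number}: expected assembly, ID, and paired read files"
            )
        assembly, sample_id, *reads = fields
        if sample_id in seen_ids:
            raise ValueError(f"line {number}: duplicate ID {sample_id!r}")
        seen_ids.add(sample_id)
        # Pair consecutive read files: (1st, 2nd), (3rd, 4th), ...
        pairs = list(zip(reads[0::2], reads[1::2]))
        entries.append(BatchEntry(assembly, sample_id, pairs))
    return entries
```

To check a file on disk, pass an open file handle, for example `parse_batch_lines(open("batch.txt"))`, where `batch.txt` is a placeholder name.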