https://github.com/rhysnewell/gtdb_genome_filter

Small script to parse and filter out genomes in the GTDB

Host: GitHub
URL: https://github.com/rhysnewell/gtdb_genome_filter
Owner: rhysnewell
License: gpl-3.0
Created: 2021-01-21T02:57:33.000Z (almost 4 years ago)
Default Branch: main
Last Pushed: 2021-01-21T04:00:45.000Z (almost 4 years ago)
Last Synced: 2024-10-12T06:22:35.307Z (about 1 month ago)
Language: Python
Size: 20.5 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# gtdb_genome_filter
A small script to parse and filter genomes in the GTDB (Genome Taxonomy Database).

# Installation

```
git clone https://github.com/rhysnewell/gtdb_genome_filter.git
cd gtdb_genome_filter
pip install --editable .
gtdb_filter filter --help
```

# Requirements

The initial requirements for aviary can be installed into a conda environment using the `aviary.yml` file:
```
conda env create -n aviary -f aviary.yml
```

# Usage

To perform MAG (metagenome-assembled genome) recovery:
```
aviary recover \
    --assembly scaffolds.fasta \
    --short_reads_1 sr1.1.fq sr2.1.fq.gz \
    --short_reads_2 sr1.2.fq sr2.2.fq.gz \
    --longreads nanopore.fastq.gz \
    --output output_dir/ \
    --max_threads 24 \
    --n_cores 24 \
    --gtdb_path /path/to/gtdb/release/
```

# Batch Files

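Instead of providing aviary with an assembly and reads, you can provide a batch file. Each line gives the absolute path to an assembly, a unique ID, and then the absolute paths to the read files for that assembly, in the following format: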

```
/Absolute/Path/to/Assembly1.fasta Unique_ID_1 /absolute/path/to/read_set_1.1.fq.gz /absolute/path/to/read_set_1.2.fq.gz /absolute/path/to/read_set_2.1.fq.gz /absolute/path/to/read_set_2.2.fq.gz
/Absolute/Path/to/Assembly2.fasta Unique_ID_2 /absolute/path/to/read_set_3.1.fq.gz /absolute/path/to/read_set_3.2.fq.gz /absolute/path/to/read_set_4.1.fq.gz /absolute/path/to/read_set_4.2.fq.gz
```

Then specify the path to your batch file in `config.yaml`, ignoring the inputs for the fasta and any reads, and run the following command:
`snakemake --use-conda --cores 24 run_batch`
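
Before launching a batch run, it can help to sanity-check the batch file. The following is a minimal sketch, not part of aviary or this repository. It assumes fields are whitespace-separated and that the read files after the ID alternate between forward and reverse, as in the example above. The names `BatchEntry`, `parse_batch_line`, and `parse_batch_text` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class BatchEntry:
    """One line of a batch file (hypothetical helper type)."""
    assembly: str
    sample_id: str
    read_pairs: List[Tuple[str, str]]


def parse_batch_line(line: str) -> BatchEntry:
    """Split one batch line into assembly path, unique ID and read pairs.

    Assumption: fields are whitespace-separated and reads alternate
    forward/reverse, mirroring the example batch file.
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"expected an assembly path and an ID: {line!r}")
    assembly, sample_id, *reads = fields
    if len(reads) % 2 != 0:
        raise ValueError(f"odd number of read files for {sample_id}")
    # Pair consecutive files: (forward, reverse), (forward, reverse), ...
    pairs = list(zip(reads[0::2], reads[1::2]))
    return BatchEntry(assembly, sample_id, pairs)


def parse_batch_text(text: str) -> List[BatchEntry]:
    """Parse a whole batch file, skipping blank lines and rejecting duplicate IDs."""
    entries: List[BatchEntry] = []
    seen_ids = set()
    for line in text.splitlines():
        if not line.strip():
            continue
        entry = parse_batch_line(line)
        if entry.sample_id in seen_ids:
            raise ValueError(f"duplicate ID: {entry.sample_id}")
        seen_ids.add(entry.sample_id)
        entries.append(entry)
    return entries
```

For example, `parse_batch_text(open("batch.txt").read())` (with `batch.txt` as a placeholder file name) returns one entry per assembly, and raises `ValueError` if a line has an unpaired read file or an ID is reused.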