https://github.com/adamtaranto/trf2gff
Convert Tandem Repeat Finder dat file output into gff3 format
- Host: GitHub
- URL: https://github.com/adamtaranto/trf2gff
- Owner: Adamtaranto
- License: gpl-3.0
- Created: 2017-01-26T04:08:16.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2024-01-17T04:09:33.000Z (about 1 year ago)
- Last Synced: 2024-01-17T10:41:58.118Z (about 1 year ago)
- Language: Python
- Size: 45.9 KB
- Stars: 15
- Watchers: 4
- Forks: 3
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
```
████████╗██████╗ ███████╗██████╗ ██████╗ ███████╗███████╗
╚══██╔══╝██╔══██╗██╔════╝╚════██╗██╔════╝ ██╔════╝██╔════╝
██║ ██████╔╝█████╗ █████╔╝██║ ███╗█████╗ █████╗
██║ ██╔══██╗██╔══╝ ██╔═══╝ ██║ ██║██╔══╝ ██╔══╝
██║ ██║ ██║██║ ███████╗╚██████╔╝██║ ██║
╚═╝ ╚═╝ ╚═╝╚═╝ ╚══════╝ ╚═════╝ ╚═╝ ╚═╝
```
Converts Tandem Repeat Finder .dat file output into GFF3 format.

## Installation
TRF2GFF requires Python >= v3.8.
Install directly from this git repository:
```bash
pip install git+https://github.com/Adamtaranto/TRF2GFF.git
```

Or clone and install locally:
```bash
git clone https://github.com/Adamtaranto/TRF2GFF.git && cd TRF2GFF
pip install -e .
```

## Usage
**Run trf**
```bash
trf genome.fa 2 6 6 80 10 50 2000 -h
# Where args are Match, Mismatch, Delta, PM, PI, Minscore, MaxPeriod, [options]
# Output: genome.fa.2.6.6.80.10.50.2000.dat
```

**Convert .dat file to gff3**

Here are three examples of how you can use `trf2gff` to process a `trf` .dat file:
```bash
# Option 1:
# Read from infile and write gff to default outfile
trf2gff -i genome.fa.2.6.6.80.10.50.2000.dat
# Output: genome.fa.2.6.6.80.10.50.2000.gff3

# Option 2:
# Read input from stdin and write to stdout
trf2gff -o - < genome.fa.2.6.6.80.10.50.2000.dat > genome.gff3
# Output: genome.gff3

# Option 3:
# Read from stdin and write to file
trf2gff -o genome.gff3 < genome.fa.2.6.6.80.10.50.2000.dat
# Output: genome.gff3
```

**Extract annotated tandem-repeat features**

Use `bedtools getfasta` to extract the annotated trf features from the genome.
```bash
bedtools getfasta -fi genome.fa -bed genome.gff3
```
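
If `bedtools` is not available, the same extraction can be approximated in plain Python. The sketch below is illustrative only and is not part of `trf2gff`: the helper names `read_fasta` and `extract_features`, the `seqid:start-end` labels, and the demo GFF3 row (including its source and type columns) are assumptions made for this example. It relies only on the fact that GFF3 coordinates are 1-based and inclusive.

```python
import io


def read_fasta(handle):
    """Return {sequence_name: sequence} from a FASTA file handle."""
    records = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Keep only the first whitespace-delimited token as the name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def extract_features(gff_handle, sequences):
    """Yield (label, subsequence) for each feature row in a GFF3 handle."""
    for line in gff_handle:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete nine-column feature row
        seqid, start, end = cols[0], int(cols[3]), int(cols[4])
        # GFF3 is 1-based inclusive; Python slices are 0-based half-open.
        yield f"{seqid}:{start}-{end}", sequences[seqid][start - 1:end]


# Hypothetical in-memory demo data standing in for genome.fa and genome.gff3.
fasta = io.StringIO(">chr1 demo\nACGTACGTAC\nGTTTTTTTTT\n")
gff = io.StringIO(
    "##gff-version 3\n"
    "chr1\tTRF\ttandem_repeat\t12\t20\t.\t+\t.\tID=TR1\n"
)

genome = read_fasta(fasta)
for label, seq in extract_features(gff, genome):
    print(f">{label}\n{seq}")
```

To run this on real files, replace the two `io.StringIO` objects with `open("genome.fa")` and `open("genome.gff3")`. The sketch loads the whole genome into memory and ignores strand, so `bedtools getfasta` remains the better choice for large genomes.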