https://github.com/poisonalien/gtf2fasta
Toolkit for gtf format conversion and sequence extraction
gtf rna-seq
Last synced: 6 months ago
- Host: GitHub
- URL: https://github.com/poisonalien/gtf2fasta
- Owner: PoisonAlien
- License: mit
- Created: 2017-04-17T09:52:04.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2024-01-26T14:55:06.000Z (over 1 year ago)
- Last Synced: 2025-04-30T10:49:02.958Z (6 months ago)
- Topics: gtf, rna-seq
- Language: Julia
- Size: 10.7 KB
- Stars: 5
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# gtf2fasta
A minimal tool that extracts the sequence of every transcript (exonic regions only) in a [gtf](http://mblab.wustl.edu/GTF22.html) file from a fasta file. It's written in [Julia](https://julialang.org/) and has no dependencies. See [here](https://julialang.org/downloads/) for Julia installation.
This tool requires an [indexed fasta file](http://www.htslib.org/doc/faidx.html) for memory-efficient sequence extraction.
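
To illustrate why an index keeps memory use low, here is a minimal sketch of random access through the five-column `.fai` format (name, length, offset, bases per line, bytes per line), which is typically created with `samtools faidx`. The names `FaiRecord`, `read_fai` and `fetch_seq` are hypothetical and are not taken from `gtf2fasta.jl`; the tool's actual implementation may differ.

```julia
# Hypothetical helpers, not taken from gtf2fasta.jl.
struct FaiRecord
    length::Int      # number of bases in the sequence
    offset::Int      # byte offset of the first base in the fasta file
    linebases::Int   # bases per line
    linewidth::Int   # bytes per line, including the line terminator
end

# Read a .fai file into a Dict keyed by sequence name.
function read_fai(path::AbstractString)
    index = Dict{String,FaiRecord}()
    for line in eachline(path)
        f = split(line, '\t')
        index[String(f[1])] = FaiRecord(parse(Int, f[2]), parse(Int, f[3]),
                                        parse(Int, f[4]), parse(Int, f[5]))
    end
    return index
end

# Fetch a 1-based, inclusive interval by seeking directly to its bytes,
# so the rest of the fasta file is never loaded.
function fetch_seq(fasta, index, name, start, stop)
    rec = index[name]
    1 <= start <= stop <= rec.length || error("interval out of range")
    # Byte position of a 0-based base index within the file.
    bytepos(i) = rec.offset + div(i, rec.linebases) * rec.linewidth + rem(i, rec.linebases)
    first_byte = bytepos(start - 1)
    last_byte = bytepos(stop - 1)
    open(fasta) do io
        seek(io, first_byte)
        raw = read(io, last_byte - first_byte + 1)
        # Drop line terminators that fall inside the interval.
        return String(filter(b -> b != UInt8('\n') && b != UInt8('\r'), raw))
    end
end

# Toy data: one 22-base sequence wrapped at 10 bases per line.
dir = mktempdir()
fa = joinpath(dir, "toy.fa")
write(fa, ">chr1\nACGTACGTAC\nGTACGTACGT\nAC\n")
write(fa * ".fai", "chr1\t22\t6\t10\t11\n")

idx = read_fai(fa * ".fai")
println(fetch_seq(fa, idx, "chr1", 9, 12))  # prints ACGT (spans a line break)
```

Only the requested bytes are read, so a single exon can be pulled from a chromosome-sized sequence without holding the chromosome in memory.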
## Usage
```bash
# Extracting fasta sequence from gtf file (sequences are written to stdout)
$ gtf2fasta.jl ens82.gtf hg19.fa | head
>ENST00000456328
GTTAACTTGCCGTCAGCCTTTTCTTTGACCTCTTCTTTCTGTTCATGTGTATTTGCTGTCTCTTAGCCCA
GACTTCCCGTGTCCTTTCCACCGGGCCTTTGAGAGGTCACAGGGTCTTGATGCTGTGGTCTTCATCTGCA
GGTGTCTGACTTCCAGCAACTGCTGGCCTGTGCCAGGGTGCAAGCTGAGCACTGGAGTGGAGTTTTCCTG
...
```

It will also generate `ens82.gtf.transcript.dict.tsv` with transcript to gene mappings.
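
As a rough illustration of where such a mapping comes from, the sketch below collects `transcript_id` to `gene_id` pairs from the attribute column (column 9) of GTF lines. The function name `transcript_gene_map` and the toy input lines are assumptions made for this example. The column layout of the generated `.tsv` is not documented here, so the printed format is only a guess and is not the tool's actual output.

```julia
# Hypothetical sketch, not the code in gtf2fasta.jl.
function transcript_gene_map(lines)
    mapping = Dict{String,String}()
    for line in lines
        # Skip blank lines and header/comment lines.
        (isempty(line) || startswith(line, "#")) && continue
        fields = split(line, '\t')
        length(fields) >= 9 || continue
        tx = match(r"transcript_id \"([^\"]+)\"", fields[9])
        gene = match(r"gene_id \"([^\"]+)\"", fields[9])
        # Gene-level records carry no transcript_id and are skipped.
        (tx === nothing || gene === nothing) && continue
        mapping[String(tx.captures[1])] = String(gene.captures[1])
    end
    return mapping
end

# Toy GTF lines with illustrative coordinates.
gtf_lines = [
    "#!genome-build toy",
    "1\ttoy\texon\t100\t200\t.\t+\t.\tgene_id \"ENSG00000223972\"; transcript_id \"ENST00000456328\"; exon_number \"1\";",
    "1\ttoy\tgene\t100\t400\t.\t+\t.\tgene_id \"ENSG00000223972\";",
]

for (tx, gene) in transcript_gene_map(gtf_lines)
    println(tx, '\t', gene)
end
```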