https://github.com/poisonalien/gtf2fasta

Toolkit for GTF format conversion and sequence extraction
gtf rna-seq


# gtf2fasta
A minimal tool that extracts the sequence of every transcript (exonic regions only) listed in a [GTF](http://mblab.wustl.edu/GTF22.html) file from a FASTA file.

It's written in [Julia](https://julialang.org/) and has no dependencies. See [here](https://julialang.org/downloads/) for Julia installation.

This tool requires an [indexed FASTA file](http://www.htslib.org/doc/faidx.html) for memory-efficient sequence extraction.
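
To illustrate why an index keeps memory use low, here is a minimal Python sketch of faidx-style random access. It is not the Julia code used by `gtf2fasta.jl`; the helper names `build_fai` and `fetch` and the toy FASTA are hypothetical. It relies only on the `.fai` fields (sequence length, byte offset, bases per line, bytes per line) and assumes every sequence line except the last has the same width.

```python
import os
import tempfile

def build_fai(fasta_path):
    """Build a faidx-like index: name -> [length, offset, linebases, linewidth]."""
    index = {}
    with open(fasta_path, "rb") as fh:
        name = None
        while True:
            line = fh.readline()
            if not line:
                break
            if line.startswith(b">"):
                name = line[1:].split()[0].decode()
                # Offset is the byte position of the first base of this record.
                index[name] = [0, fh.tell(), 0, 0]
            elif name is not None:
                entry = index[name]
                bases = len(line.rstrip(b"\r\n"))
                if entry[2] == 0:
                    # First sequence line defines bases per line and bytes per line.
                    entry[2], entry[3] = bases, len(line)
                entry[0] += bases
    return index

def fetch(fasta_path, index, name, start, end):
    """Return bases start..end (1-based, inclusive, as in GTF) using seek()."""
    length, offset, linebases, linewidth = index[name]
    start0 = start - 1
    end0 = min(end, length) - 1
    first = offset + (start0 // linebases) * linewidth + start0 % linebases
    last = offset + (end0 // linebases) * linewidth + end0 % linebases
    with open(fasta_path, "rb") as fh:
        fh.seek(first)  # jump straight to the region; nothing else is loaded
        chunk = fh.read(last - first + 1)
    return chunk.replace(b"\n", b"").replace(b"\r", b"").decode()

# Toy FASTA (hypothetical data) with 4 bases per line.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.fa")
    with open(path, "w", newline="\n") as fh:
        fh.write(">chr1\nACGT\nACGT\nAC\n>chr2\nTTTT\nGGGG\n")
    index = build_fai(path)
    exon = fetch(path, index, "chr1", 3, 6)
    print(index["chr1"], exon)
```

Only the bytes covering the requested region are read, so memory use depends on exon length rather than on the size of the genome file.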

## Usage

```bash
#Extracting fasta sequence from gtf file (sequences are written to stdout)

$ gtf2fasta.jl ens82.gtf hg19.fa | head
>ENST00000456328
GTTAACTTGCCGTCAGCCTTTTCTTTGACCTCTTCTTTCTGTTCATGTGTATTTGCTGTCTCTTAGCCCA
GACTTCCCGTGTCCTTTCCACCGGGCCTTTGAGAGGTCACAGGGTCTTGATGCTGTGGTCTTCATCTGCA
GGTGTCTGACTTCCAGCAACTGCTGGCCTGTGCCAGGGTGCAAGCTGAGCACTGGAGTGGAGTTTTCCTG
...
```

It also generates `ens82.gtf.transcript.dict.tsv`, which contains transcript-to-gene mappings.
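
As a rough illustration of how such a mapping can be derived from a GTF file, the Python sketch below reads the `transcript_id` and `gene_id` attributes from the ninth column. It is an assumption-laden sketch, not the tool's Julia implementation: the function name `transcript_to_gene` and the sample records are hypothetical, and the actual column layout of the generated TSV is not documented here.

```python
import re

# GTF attributes look like: gene_id "G1"; transcript_id "T1";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')

def transcript_to_gene(gtf_lines):
    """Map each transcript_id to its gene_id (hypothetical helper)."""
    mapping = {}
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attrs = dict(ATTR_RE.findall(fields[8]))
        tid, gid = attrs.get("transcript_id"), attrs.get("gene_id")
        if tid and gid:
            mapping.setdefault(tid, gid)
    return mapping

# Toy records (hypothetical identifiers and coordinates).
sample = [
    "#!toy annotation\n",
    'chr1\tsrc\texon\t3\t6\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr1\tsrc\texon\t8\t10\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr1\tsrc\texon\t3\t6\t.\t+\t.\tgene_id "G1"; transcript_id "T2";\n',
    'chr2\tsrc\texon\t1\t4\t.\t-\t.\tgene_id "G2"; transcript_id "T3";\n',
]
mapping = transcript_to_gene(sample)
for tid, gid in mapping.items():
    print(f"{tid}\t{gid}")
```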