https://github.com/ncrnalab/ribofy

ORF detection using RiboSeq data

Host: GitHub
URL: https://github.com/ncrnalab/ribofy
Owner: ncrnalab
License: mit
Created: 2021-06-24T11:16:06.000Z (about 4 years ago)
Default Branch: main
Last Pushed: 2021-08-19T07:12:04.000Z (almost 4 years ago)
Last Synced: 2024-08-05T15:04:29.387Z (11 months ago)
Language: Python
Size: 95.7 KB
Stars: 1
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

awesome-riboseq - Code - sites | (ORF Calling)

README

# ribofy: ORF detection using RiboSeq data

Ribofy is a fast and simple Python-based tool for detecting phased P-sites across open reading frames (ORFs).

## Installation

* pip (soon)
```
pip install ribofy
```

* from source
```
git clone https://github.com/ncrnalab/ribofy.git
cd ribofy
python setup.py install
```

## Running ribofy

First, assemble all ORFs from an annotation file (preferably a [GENCODE](https://www.gencodegenes.org/) GTF) and the corresponding genome FASTA. This step should take no more than 5-10 minutes and is only required once per genome/annotation:

```
ribofy orfs --gtf --fa
```

The genome FASTA file must be indexed before ORF assembly:
```
samtools faidx
```
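To make "assembling ORFs" concrete, here is a minimal Python sketch of ORF enumeration on a single, already-spliced transcript sequence. It is not ribofy's implementation: the function name `find_orfs`, the ATG-only start codon, and the `min_codons` threshold are assumptions chosen for illustration, and the real tool works from the GTF and genome FASTA rather than from a plain string.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, start_codon="ATG", min_codons=2):
    """Return (start, end, frame) tuples for ORFs on the forward strand.

    start is the 0-based index of the first base of the start codon,
    end is exclusive and includes the stop codon. For each stop codon,
    only the most upstream in-frame start codon is reported.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == start_codon:
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Count codons before the stop (start codon included).
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


if __name__ == "__main__":
    for start, end, frame in find_orfs("CCATGAAATGGTAGCC"):
        print(start, end, frame)
```

Reporting only the most upstream start per stop codon keeps the output small; a tool that also considers alternative starts would need to emit every in-frame start instead.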

Currently, ribofy is compatible with reads mapped using STAR, kallisto, or salmon. Recommended mapping commands:

* STAR
```
STAR --genomeDir --outSAMtype BAM SortedByCoordinate \
  --readFilesIn --readFilesCommand zcat --outFileNamePrefix .
```
* salmon
```
salmon quant -i --gcBias --validateMappings {additional_params} --writeMappings= -o
```
* kallisto
```
kallisto quant -i --bias -o --single --pseudobam --fr-stranded -l 30 -s 2
```

Note that for kallisto and salmon, the index should be built with a reduced k-mer size so that ribosome-protected fragments shorter than 30 nt can be mapped.
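The reason is simple arithmetic: a k-mer-based index can only seed matches from reads that contain at least one full k-mer, and a read of length L contains L - k + 1 of them. The sketch below works through that calculation. The helper name is hypothetical, k = 31 is the usual default index k-mer size in both kallisto and salmon, and k = 19 is only an illustrative reduced value, not a recommendation from the ribofy authors.

```python
def kmers_per_read(read_length, k):
    """Number of k-mers contained in a read of the given length."""
    return max(0, read_length - k + 1)


if __name__ == "__main__":
    # Typical ribosome-protected fragments are roughly 25-35 nt long.
    for k in (31, 19):
        for read_length in (26, 28, 30, 32):
            n = kmers_per_read(read_length, k)
            print(f"k={k} read_length={read_length} kmers={n}")
```

With k = 31, every read of 30 nt or shorter contains zero k-mers and cannot be assigned, whereas a smaller k leaves several k-mers per read.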

Before running ribofy, BAM files should be sorted and indexed:
```
samtools sort >
samtools index
```

Then, run ribofy:

```
ribofy detect --orfs --bams --prefix
```

## Under the hood

1) Ribofy infers the P-site offsets for read lengths between 25 and 35 nt (this range can be customized) and writes them to an output file ending in `.offset.txt`.

2) For each ORF, ribofy then counts the P-sites and evaluates the statistical enrichment of in-frame P-sites. The results are written to an output file ending in `.phasing.txt`.

3) Finally, ribofy collects the individual ORFs into ORF groups by collapsing overlapping and correlating ORFs, keeping only the most highly expressed ORF in each group (based on overall coverage). It then performs ORF-type-specific FDR correction and writes the final results to an output file ending in `.results.txt`.
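Steps 2 and 3 can be illustrated with a small, self-contained Python sketch. The README does not say which statistical test or FDR procedure ribofy uses, so everything here is an assumption: a one-sided binomial test against a uniform 1/3 in-frame expectation stands in for the phasing test, and Benjamini-Hochberg stands in for the FDR correction. All function names and the ORF-type labels are hypothetical.

```python
from math import comb


def frame_counts(psite_positions, orf_start):
    """Count P-sites falling in frame 0, 1 and 2 relative to the ORF start."""
    counts = [0, 0, 0]
    for pos in psite_positions:
        counts[(pos - orf_start) % 3] += 1
    return counts


def inframe_pvalue(counts, p0=1 / 3):
    """One-sided binomial p-value for enrichment of frame-0 P-sites.

    Assumed null model: each P-site lands in frame 0 with probability p0.
    """
    n, k = sum(counts), counts[0]
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def fdr_by_type(orfs):
    """Apply the correction separately within each ORF type.

    orfs is a list of (orf_id, orf_type, pvalue) tuples; returns
    a dict mapping orf_id to its adjusted p-value.
    """
    by_type = {}
    for orf_id, orf_type, p in orfs:
        by_type.setdefault(orf_type, []).append((orf_id, p))
    result = {}
    for items in by_type.values():
        adj = benjamini_hochberg([p for _, p in items])
        for (orf_id, _), q in zip(items, adj):
            result[orf_id] = q
    return result


if __name__ == "__main__":
    counts = frame_counts([100, 103, 106, 107], orf_start=100)
    print(counts, inframe_pvalue(counts))
```

Correcting within each ORF type means that a large class with many tests does not dominate the multiple-testing burden of a small one, which is one plausible motivation for the type-specific correction described above.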

## Citation
*in preparation*

## Contact
Thomas Hansen ([email protected])
