Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/nextomics/pipeline-for-isoseq
A pipeline for isoseq
- Host: GitHub
- URL: https://github.com/nextomics/pipeline-for-isoseq
- Owner: Nextomics
- Created: 2017-01-06T13:33:15.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2018-06-08T01:35:55.000Z (over 6 years ago)
- Last Synced: 2023-10-20T22:02:58.267Z (about 1 year ago)
- Language: Perl
- Size: 35.2 KB
- Stars: 22
- Watchers: 7
- Forks: 12
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
### 1. Quality control
### 2. Classification
Quality control and classification are run together with the SMRT Analysis protocol `RS_IsoSeq.1` (version 2.3.0), using default parameters.
```
smrtpipe.py --distribute --params=settings.xml --output=outputdir xml:input.xml 2> smrtpipe.stderr 1> smrtpipe.stdout
```

### 3. Clustering
Mapping and phasing
```
perl phase_allotetraploid_pipeline.pl --flnc flnc.fastq --gmap_genome_directory database/ --gmap_genome_database databasename --outdir ./result --reference_fasta ref.fasta
```

Cluster the reads at the isoform level according to their alignments.
```
python collapse_isoforms_by_sam.py -c 0.90 -i 0.90 --input flnc.fastq --fq -s flnc.sort.sam -o all
```
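To make the two thresholds concrete, here is a minimal sketch of how a coverage and an identity value can be computed for one SAM record. It assumes that `-c` and `-i` are the minimum alignment coverage and minimum alignment identity, and it uses one common definition of each (query bases aligned over read length, and matching columns over alignment columns). The exact formulas inside `collapse_isoforms_by_sam.py` may differ, and the function names below are hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def coverage_and_identity(cigar, edit_distance):
    """Return (coverage, identity) for one alignment.

    cigar         -- CIGAR string of the SAM record
    edit_distance -- value of the NM tag (mismatches plus indel bases)
    These definitions are an assumption, not the script's own code.
    """
    query_len = 0      # all read bases, including clipped ones
    aligned_query = 0  # read bases inside the alignment
    columns = 0        # alignment columns, introns (N) excluded
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "MI=X":
            aligned_query += n
        if op in "MI=XSH":
            query_len += n
        if op in "MID=X":
            columns += n
    coverage = aligned_query / query_len
    identity = (columns - edit_distance) / columns
    return coverage, identity


def passes(cigar, edit_distance, min_coverage=0.90, min_identity=0.90):
    """Apply the thresholds passed above as -c 0.90 and -i 0.90."""
    coverage, identity = coverage_and_identity(cigar, edit_distance)
    return coverage >= min_coverage and identity >= min_identity


# A 100-base read with 5 bases soft-clipped at each end and 3 edits.
print(passes("5S90M5S", 3))
```

Under these definitions, a read whose alignment leaves more than 10% of its bases clipped would be excluded from clustering.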
Build the consensus: each cluster generates one consensus sequence.
```
perl analysis_cluster.pl all.collapsed.group.txt flnc.sort.sam flnc.fastq > flnc.best.sort.sam
```
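Before building consensus sequences it can help to inspect the clusters. The sketch below reads a group file and reports how many reads each cluster contains. It assumes that `all.collapsed.group.txt` has one cluster per line, with the cluster ID and a comma-separated list of read IDs separated by a tab; check this against your own file. The read names in the example are made up.

```python
from collections import OrderedDict


def read_groups(lines):
    """Parse '<cluster_id>\\t<read1>,<read2>,...' lines into a mapping."""
    groups = OrderedDict()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        cluster_id, members = line.split("\t", 1)
        groups[cluster_id] = members.split(",")
    return groups


# Hypothetical content; for a real run use open("all.collapsed.group.txt").
example = [
    "PB.1.1\tread_a,read_b,read_c",
    "PB.1.2\tread_d",
]
groups = read_groups(example)
for cluster_id, reads in groups.items():
    print(cluster_id, len(reads))
```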
Run the isoform-level clustering again.
```
python collapse_isoforms_by_sam.py -c 0.90 -i 0.90 --input chose.fq --fq -s flnc.best.sort.sam -o all.consensus
```
Convert the alignments from SAM to BAM, then to BED and GFF format.
```
samtools view -bS all.consensus.collapsed.rep.fq.sam > all.consensus.bam
bedtools bamtobed -split -i all.consensus.bam > all.consensus.bed
perl bed2cDNA_match.pl all.consensus.collapsed.rep.fq all.consensus.collapsed.rep.fq.sam > all.consensus.cDNA_match.gff
```

### 4. Transcriptome analysis
Alternative splicing analysis.
```
python alternative_splice.py -i all.consensus.cDNA_match.gff -g ref.gtf -f ref.fasta -o ./ -os -as -ats T -op
```

Alternative polyadenylation analysis.
```
perl polyA_position.pl all.consensus.collapsed.gff all.consensus.collapsed.rep.fq flnc.sort.sam > transcript_polyA.result
```
Find fusion genes.
```
python fusion_finder.py --input flnc.fastq --fq -s flnc.sort.sam -o ./fusion
```
Find non-coding RNAs.
```
python PLEKModelling.py -lncRNA high_quality_lncRNA.fa -prefix species -mRNA mRNA.fasta
python PLEK.py -fasta flnc.fasta -out lncRNA.predicted -thread 10 -range species.range -model species.model -k 4
```

### 5. Program list
#### changed to https://github.com/fancy1124/pipeline-for-isoseq
- smrtanalysis (http://www.pacb.com/products-and-services/analytical-software/smrt-analysis/)
- gmap (http://research-pub.gene.com/gmap/)
- phase_allotetraploid_pipeline.pl (attached)
- collapse_isoforms_by_sam.py from pbtranscript-tofu (https://github.com/PacificBiosciences/cDNA_primer)
- analysis_cluster.pl in-house Perl script
- bed2cDNA_match.pl in-house Perl script
- samtools (http://www.htslib.org/)
- bedtools (https://github.com/arq5x/bedtools2/)
- alternative_splice.py developed by ourselves (attached)
- polyA_position.pl
- fusion_finder.py (https://github.com/PacificBiosciences/cDNA_primer/blob/master/pbtranscript-tofu/pbtranscript/pbtools/pbtranscript/fusion_finder.py)
- PLEKModelling.py from plek (https://sourceforge.net/projects/plek/files/)
- PLEK.py from plek (https://sourceforge.net/projects/plek/files/)
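Since the pipeline depends on several external programs, a quick check that they are installed can save a failed run. The following is a minimal sketch, assuming the third-party tools are expected on `PATH` under the executable names shown; the in-house and attached scripts are called by path in the commands above and are not checked here. The helper name `missing_programs` is hypothetical.

```python
import shutil

# Executable names taken from the commands and the list above.
REQUIRED = ["smrtpipe.py", "gmap", "samtools", "bedtools"]


def missing_programs(names, which=shutil.which):
    """Return the names that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


missing = missing_programs(REQUIRED)
if missing:
    print("Not found on PATH: " + ", ".join(missing))
else:
    print("All required programs were found.")
```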