https://github.com/likelet/exomepipe
- Host: GitHub
- URL: https://github.com/likelet/exomepipe
- Owner: likelet
- License: gpl-3.0
- Created: 2017-11-30T09:21:04.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2020-09-24T01:53:17.000Z (over 4 years ago)
- Last Synced: 2025-01-07T20:45:54.748Z (4 months ago)
- Topics: nextflow, ngs, pipeline
- Language: Nextflow
- Homepage:
- Size: 195 KB
- Stars: 7
- Watchers: 4
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# ExomePipe
## Dependencies
* fastp
* bwa
* picard/samtools
* gatk4 (CNV analysis)
* mutect2
* annovar and annovar DB (deprecated)
* vep and vep DB (very large, ~15 GB)
* vcf2maftools
* MAFtools (only available when multiple paired samples are involved)
* some in-house scripts
* msisensor
* Dependencies for FREEC
  * Control-freeC
  * bedtools
  * samtools
* FACET
  * snp-pileup
* NGScheckmate
  * bcftools
  * python
  * samtools
# Input file
* `sample input file`: TSV file for sample

> Adjusted from https://github.com/SciLifeLab/Sarek/blob/master/docs/INPUT.md
# TSV file for sample
Input files for ExomeSeqPipe can be specified using a TSV file given to the `--sample` parameter. The TSV file is a tab-separated values file with the columns `subject gender status sample lane fastq1 fastq2` or `subject gender status sample bam bai`.

The content of these columns should be straightforward:

- `subject` designates the subject. It should be the ID of the patient or, if you don't have one, the ID of the normal sample.
- `gender` is the gender of the patient (XX or XY)
- `status` is the status of the sample (0 for normal or 1 for tumor)
- `sample` designates the sample. It should be the ID of the sample (it is possible to have more than one tumor sample for each patient)
- `fastq1` is the path to the first file of the FASTQ pair
- `fastq2` is the path to the second file of the FASTQ pair
- `bam` is the BAM file
- `bai` is the index of the BAM file

All examples are given for a normal/tumor pair. If no tumors are listed in the TSV file, the workflow proceeds as if it were given a single normal sample instead of a normal/tumor pair.
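The column rules above can be checked before launching the pipeline. The following is a minimal Python sketch, not part of ExomePipe itself: the function names `parse_sample_tsv` and `group_by_subject` are hypothetical, and the file columns after `sample` are kept as an opaque list because the exact layout (with or without `lane`, FASTQ or BAM) is not validated here. It only checks the documented `gender` and `status` values and reports which subjects have a normal/tumor pair.

```python
import csv
import io
from collections import defaultdict

# Allowed values taken from the column descriptions in this README.
VALID_GENDERS = {"XX", "XY"}
STATUS_LABELS = {"0": "normal", "1": "tumor"}


def parse_sample_tsv(handle):
    """Read a tab-separated sample sheet and return one dict per row.

    Hypothetical helper: the first four columns are interpreted as
    subject, gender, status, sample; all remaining columns are kept
    untouched in the "files" list.
    """
    rows = []
    for line_no, fields in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        if not fields or all(not f.strip() for f in fields):
            continue  # skip blank lines
        if len(fields) < 6:
            raise ValueError(
                f"line {line_no}: expected at least 6 tab-separated columns, "
                f"got {len(fields)}"
            )
        subject, gender, status, sample, *files = (f.strip() for f in fields)
        if gender not in VALID_GENDERS:
            raise ValueError(f"line {line_no}: gender must be XX or XY, got {gender!r}")
        if status not in STATUS_LABELS:
            raise ValueError(f"line {line_no}: status must be 0 or 1, got {status!r}")
        rows.append(
            {
                "subject": subject,
                "gender": gender,
                "status": STATUS_LABELS[status],
                "sample": sample,
                "files": files,
            }
        )
    return rows


def group_by_subject(rows):
    """Collect normal and tumor sample IDs per subject."""
    groups = defaultdict(lambda: {"normal": [], "tumor": []})
    for row in rows:
        samples = groups[row["subject"]][row["status"]]
        if row["sample"] not in samples:  # a sample may span several rows
            samples.append(row["sample"])
    return dict(groups)


# Usage with the two example rows from this README (tab-delimited).
example = (
    "G15511\tXX\t0\tC09DFN\tpathToFiles/C09DFACXX111207.1_1.fastq.gz"
    "\tpathToFiles/C09DFACXX111207.1_2.fastq.gz\n"
    "G15511\tXX\t1\tD0ENMT\tpathToFiles/D0ENMACXX111207.1_1.fastq.gz"
    "\tpathToFiles/D0ENMACXX111207.1_2.fastq.gz\n"
)
for subject, samples in group_by_subject(parse_sample_tsv(io.StringIO(example))).items():
    mode = "normal/tumor pair" if samples["tumor"] else "single normal sample"
    print(subject, mode, samples)
```

A subject with no tumor rows is reported as a single normal sample, mirroring the behaviour described above. To check a real file, pass `open(path, newline="")` instead of the `io.StringIO` object.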
* Example TSV file for a normal/tumor pair with FASTQ files
In this example, the normal sample and the tumor sample each have one pair of FASTQ files. It is recommended to use absolute paths for the paired FASTQ files, although relative paths should also work. Note that the delimiter is the tab (`\t`) character.
NOTE: each sample is assumed to have only one library.
```
G15511 XX 0 C09DFN pathToFiles/C09DFACXX111207.1_1.fastq.gz pathToFiles/C09DFACXX111207.1_2.fastq.gz
G15511 XX 1 D0ENMT pathToFiles/D0ENMACXX111207.1_1.fastq.gz pathToFiles/D0ENMACXX111207.1_2.fastq.gz
```
# Contribution
Qi Zhao([email protected])