https://github.com/likelet/exomepipe

nextflow ngs pipeline

Host: GitHub
URL: https://github.com/likelet/exomepipe
Owner: likelet
License: gpl-3.0
Created: 2017-11-30T09:21:04.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2020-09-24T01:53:17.000Z (almost 5 years ago)
Last Synced: 2025-01-07T20:45:54.748Z (6 months ago)
Topics: nextflow, ngs, pipeline
Language: Nextflow
Homepage:
Size: 195 KB
Stars: 7
Watchers: 4
Forks: 3
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# ExomePipe
## Dependencies
* fastp
* bwa
* picard/samtools
* gatk4 (CNV analysis)
* mutect2
* annovar and annovar DB (deprecated)
* vep and vep DB (very large, ~15 GB)
* vcf2maftools
* MAFtools (only available when multiple paired samples are involved)
* some in-house scripts
* msisensor

* Dependencies for FREEC
  * Control-FREEC
  * bedtools
  * samtools
* FACETS
  * snp-pileup

* NGSCheckMate
  * bcftools
  * python
  * samtools
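
Before launching the pipeline it can help to check which of these tools are visible on `PATH`. The following is a minimal sketch, not part of ExomePipe: the executable names in `ASSUMED_EXECUTABLES` are assumptions based on the usual command names of the tools listed above and may differ in your installation.

```python
import shutil

# Assumption: these are the command names under which the tools are installed.
# Adjust the list to match your environment.
ASSUMED_EXECUTABLES = [
    "fastp",
    "bwa",
    "samtools",
    "bcftools",
    "bedtools",
    "msisensor",
    "snp-pileup",
]


def find_missing(executables):
    """Return the names from `executables` that cannot be found on PATH."""
    return [name for name in executables if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing(ASSUMED_EXECUTABLES)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed executables were found on PATH.")
```

Tools that are run through a wrapper or a container (for example Picard or VEP) are deliberately left out of the list, since their invocation depends on how they were installed.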

# Input file

* `sample input file`: a TSV file describing the samples
> Adapted from https://github.com/SciLifeLab/Sarek/blob/master/docs/INPUT.md

# TSV file for sample

Input files for ExomeSeqPipe are specified in a TSV (tab-separated values) file passed to the `--sample` parameter. The file has the columns `subject gender status sample lane fastq1 fastq2` or `subject gender status sample bam bai`.
The columns are defined as follows:

- `subject` designates the subject. It should be the patient ID or, if you don't have one, the ID of the normal sample.
- `gender` is the gender of the patient (XX or XY).
- `status` is the status of the sample (0 for normal, 1 for tumor).
- `sample` designates the sample. It should be the sample ID (a patient can have more than one tumor sample).
- `fastq1` is the path to the first FASTQ file of the pair.
- `fastq2` is the path to the second FASTQ file of the pair.
- `bam` is the path to the BAM file.
- `bai` is the path to the BAM index.
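
The column rules above can be checked before a run. The sketch below is illustrative only and is not the pipeline's own parser. It assumes the six-column layout used in the example rows of this README (no `lane` column) and assumes that a row whose fifth field ends in `.bam` uses the BAM layout. The function names are hypothetical.

```python
import csv
import io

# Assumption: six-column layouts, matching the example rows in this README.
FASTQ_COLUMNS = ["subject", "gender", "status", "sample", "fastq1", "fastq2"]
BAM_COLUMNS = ["subject", "gender", "status", "sample", "bam", "bai"]


def parse_sample_row(fields):
    """Map one tab-split row to named columns and check gender and status."""
    if len(fields) != 6:
        raise ValueError(f"expected 6 columns, got {len(fields)}")
    # Assumption: the fifth field tells the two layouts apart.
    columns = BAM_COLUMNS if fields[4].endswith(".bam") else FASTQ_COLUMNS
    row = dict(zip(columns, fields))
    if row["gender"] not in ("XX", "XY"):
        raise ValueError(f"gender must be XX or XY, got {row['gender']!r}")
    if row["status"] not in ("0", "1"):
        raise ValueError(f"status must be 0 or 1, got {row['status']!r}")
    return row


def read_sample_tsv(text):
    """Parse TSV text into a list of validated row dictionaries."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    # Skip empty lines, which csv.reader yields as empty lists.
    return [parse_sample_row(fields) for fields in reader if fields]
```

Calling `read_sample_tsv(open("samples.tsv").read())` raises a `ValueError` on the first malformed row, which is usually easier to diagnose than a failure in the middle of a pipeline run.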

All examples are given for a normal/tumor pair. If no tumors are listed in the TSV file, the workflow proceeds as if the input were a single normal sample rather than a normal/tumor pair.
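
To see how a given sample sheet would be interpreted under this rule, the rows can be grouped by `subject` and labelled. This is a hedged sketch of the idea, not the workflow's actual logic. The function name and the labels `paired`, `normal_only`, and `tumor_only` are hypothetical, and the tumor-only case is not described in this README.

```python
from collections import defaultdict


def classify_subjects(rows):
    """Group (subject, status, sample) tuples and label each subject."""
    samples = defaultdict(lambda: {"normal": [], "tumor": []})
    for subject, status, sample in rows:
        # status "0" means normal, status "1" means tumor.
        key = "tumor" if status == "1" else "normal"
        if sample not in samples[subject][key]:
            samples[subject][key].append(sample)

    result = {}
    for subject, groups in samples.items():
        if groups["tumor"] and groups["normal"]:
            mode = "paired"
        elif groups["normal"]:
            mode = "normal_only"
        else:
            # Not covered by the README; flagged so it can be reviewed by hand.
            mode = "tumor_only"
        result[subject] = {"mode": mode, **groups}
    return result
```

A subject with one normal and two tumor samples is labelled `paired` and keeps both tumor sample IDs, which matches the statement that a patient can have more than one tumor sample.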

* Example TSV file for a normal/tumor pair with FASTQ files

In this example the normal sample and the tumor sample each have one pair of FASTQ files. Absolute paths to the paired FASTQ files are recommended, but relative paths should also work. Note that the delimiter is the tab (\t) character.

NOTE: each sample is assumed to have only one library.
```
G15511 XX 0 C09DFN pathToFiles/C09DFACXX111207.1_1.fastq.gz pathToFiles/C09DFACXX111207.1_2.fastq.gz
G15511 XX 1 D0ENMT pathToFiles/D0ENMACXX111207.1_1.fastq.gz pathToFiles/D0ENMACXX111207.1_2.fastq.gz
```
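
Because tabs are easily replaced by spaces when a sample sheet is copied from a web page or edited by hand, generating the file programmatically can be safer. The sketch below writes the two example rows with real tab delimiters. The helper name `to_tsv` is hypothetical and the paths are the placeholder paths from the example.

```python
import csv
import io

# The two example rows from this README; the paths are placeholders.
ROWS = [
    ["G15511", "XX", "0", "C09DFN",
     "pathToFiles/C09DFACXX111207.1_1.fastq.gz",
     "pathToFiles/C09DFACXX111207.1_2.fastq.gz"],
    ["G15511", "XX", "1", "D0ENMT",
     "pathToFiles/D0ENMACXX111207.1_1.fastq.gz",
     "pathToFiles/D0ENMACXX111207.1_2.fastq.gz"],
]


def to_tsv(rows):
    """Serialize rows as tab-separated text, one row per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Redirect the output to a file, e.g. `python make_samples.py > samples.tsv`.
    print(to_tsv(ROWS), end="")
```

The resulting file can then be passed to the pipeline through the `--sample` parameter.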

# Contribution

Qi Zhao([email protected])
