An open API service indexing awesome lists of open source software.

https://github.com/stevekm/mutect2_target_chunking

demo pipeline for testing different data chunking methods for MuTect2
https://github.com/stevekm/mutect2_target_chunking

mutect2 nextflow variant-calling

Last synced: 4 months ago
JSON representation

demo pipeline for testing different data chunking methods for MuTect2

Awesome Lists containing this project

README

          

# MuTect2 Target Chunking

Demo pipeline for testing different data chunking methods for MuTect2.

[MuTect2](https://software.broadinstitute.org/gatk/documentation/tooldocs/3.8-0/org_broadinstitute_gatk_tools_walkers_cancer_m2_MuTect2.php) is a common tool used for variant calling of tumor-normal pairs. However, it is limited to running only in single-threaded mode, which can lead to extremely long execution times.

This demo pipeline uses different techniques to chunk the included list of target regions (`targets.bed`) into smaller segments to run in parallel, then aggregate all results for comparison to ensure that variant calls are the same across all chunking methods.

# Usage

This pipeline comes pre-configured for usage on NYULMC's Big Purple HPC cluster using pre-built Singularity containers and pre-downloaded reference files.

In order to use this pipeline on your system you will need to update the file paths saved in `nextflow.config` for your system.

Singularity and Docker container recipes are included in the `containers` directory.

Paths to input .bam files for tumor and normal samples are read from the file `samples.analysis.tsv`.

Once correctly configured, the pipeline can be run with:

```
make run
```