{"id":23768565,"url":"https://github.com/nci-gdc/mutect2-cwl","last_synced_at":"2026-02-04T23:37:07.232Z","repository":{"id":53551593,"uuid":"53163972","full_name":"NCI-GDC/mutect2-cwl","owner":"NCI-GDC","description":"CWL for GDC GATK3 MuTect2","archived":false,"fork":false,"pushed_at":"2021-03-24T21:10:26.000Z","size":97,"stargazers_count":0,"open_issues_count":0,"forks_count":3,"subscribers_count":10,"default_branch":"master","last_synced_at":"2025-01-01T01:37:27.773Z","etag":null,"topics":["bioinformatics","cwl","workflow"],"latest_commit_sha":null,"homepage":"","language":"Common Workflow Language","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/NCI-GDC.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-03-04T20:38:20.000Z","updated_at":"2021-03-24T21:10:27.000Z","dependencies_parsed_at":"2022-09-09T09:50:24.452Z","dependency_job_id":null,"html_url":"https://github.com/NCI-GDC/mutect2-cwl","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NCI-GDC%2Fmutect2-cwl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NCI-GDC%2Fmutect2-cwl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NCI-GDC%2Fmutect2-cwl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NCI-GDC%2Fmutect2-cwl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/NCI-GDC","download_url":"https://codeload.github.com/NCI-GDC/mutect2-cwl/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://gi
thub.com","kind":"github","repositories_count":239946915,"owners_count":19723018,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","cwl","workflow"],"created_at":"2025-01-01T01:37:35.709Z","updated_at":"2026-02-04T23:37:07.200Z","avatar_url":"https://github.com/NCI-GDC.png","language":"Common Workflow Language","funding_links":[],"categories":[],"sub_categories":[],"readme":"# GDC GATK3 MuTect2 CWL\n![Version badge](https://img.shields.io/badge/GATK3.6-nightly--2016--02--25--gf39d340-\u003cCOLOR\u003e.svg)\n\nThe GATK3 MuTect2 pipeline employs a \"Panel of Normals\" to identify additional germline mutations. This panel is generated using TCGA blood normal genomes from thousands of individuals that were curated and confidently assessed to be cancer-free. This method allows for a higher level of confidence to be assigned to somatic variants that were called by the MuTect2 pipeline.\n\nOriginal MuTect2: https://gatkforums.broadinstitute.org/gatk/discussion/9183/how-to-call-somatic-snvs-and-indels-using-mutect2\n\n## Docker\n\nAll the docker images are built from `Dockerfile`s at https://github.com/NCI-GDC/mutect2-tool.\n\n## CWL\n\nhttps://www.commonwl.org/\n\nThe CWL files are tested under multiple `cwltool` environments. The most tested one is:\n* cwltool 1.0.20180306163216\n\n\n## For external users\nThe repository has only been tested on GDC data and in the particular environment GDC is running in. 
Some of the reference data required by the production workflow are hosted in [GDC reference files](https://gdc.cancer.gov/about-data/data-harmonization-and-generation/gdc-reference-files \"GDC reference files\"). For any questions related to GDC data, please contact the GDC Help Desk at support@nci-gdc.datacommons.io.\n\nThere is a production-ready GDC CWL workflow at https://github.com/NCI-GDC/gdc-somatic-variant-calling-workflow, which uses this repo as a git submodule.\n\nNote that you may want to change the Docker image host in `dockerPull:` for each CWL file.\n\nTo use the CWL files directly from this repo, we recommend running\n* `tools/mutect2_pon.cwl` on each normal BAM file from the \"panel of normals\".\n* `tools/mutect2_somatic_variant.cwl` for GATK3 MuTect2 tumor/normal pair variant calling, or `tools/multi_mutect2_svc.cwl` if you prefer parallelization at the Docker level.\n\nTo run CWL:\n\n```\n\u003e\u003e\u003e\u003e\u003e\u003e\u003e\u003e\u003e\u003eMuTect2 PON\u003c\u003c\u003c\u003c\u003c\u003c\u003c\u003c\u003c\u003c\ncwltool tools/mutect2_pon.cwl -h\n/home/ubuntu/.virtualenvs/p2/bin/cwltool 1.0.20180306163216\nResolved 'tools/mutect2_pon.cwl' to 'file:///mnt/SCRATCH/githubs/submodules/t/mutect2-cwl/tools/mutect2_pon.cwl'\nusage: tools/mutect2_pon.cwl [-h] --cont CONT --cosmic COSMIC --dbsnp DBSNP\n                             --duscb --java_heap JAVA_HEAP --normal_bam\n                             NORMAL_BAM --output_name OUTPUT_NAME --ref REF\n                             --region REGION\n                             [job_order]\n\npositional arguments:\n  job_order             Job input json file\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --cont CONT           Contamination estimation score.\n  --cosmic COSMIC       Cosmic reference file path.\n  --dbsnp DBSNP         dbSNP reference file path.\n  --duscb               Whether to use soft clipped bases, default is False.\n  --java_heap JAVA_HEAP\n                   
     Java heap memory.\n  --normal_bam NORMAL_BAM\n                        Normal bam file.\n  --output_name OUTPUT_NAME\n                        Output file name.\n  --ref REF             Reference fasta file.\n  --region REGION       Region used for scattering.\n\n\n\n\u003e\u003e\u003e\u003e\u003e\u003e\u003e\u003e\u003e\u003eMuTect2 tumor/normal pair variant calling\u003c\u003c\u003c\u003c\u003c\u003c\u003c\u003c\u003c\u003c\ncwltool tools/mutect2_somatic_variant.cwl -h\n/home/ubuntu/.virtualenvs/p2/bin/cwltool 1.0.20180306163216\nResolved 'tools/mutect2_somatic_variant.cwl' to 'file:///mnt/SCRATCH/githubs/submodules/t/mutect2-cwl/tools/mutect2_somatic_variant.cwl'\nusage: tools/mutect2_somatic_variant.cwl [-h] [--cont CONT] --cosmic COSMIC\n                                         --dbsnp DBSNP --duscb\n                                         [--java_heap JAVA_HEAP] --normal_bam\n                                         NORMAL_BAM --pon PON --ref REF\n                                         --region REGION --tumor_bam TUMOR_BAM\n                                         [job_order]\n\npositional arguments:\n  job_order             Job input json file\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --cont CONT           Contamination estimation score.\n  --cosmic COSMIC       Cosmic reference file path.\n  --dbsnp DBSNP         dbSNP reference file path.\n  --duscb               Whether to use soft clipped bases, default is False.\n  --java_heap JAVA_HEAP\n                        Java heap memory.\n  --normal_bam NORMAL_BAM\n                        Normal bam file.\n  --pon PON             Panel of normal reference file path.\n  --ref REF             Reference fasta file.\n  --region REGION       Region used for scattering.\n  --tumor_bam TUMOR_BAM\n                        Tumor bam file.\n```\n\n## For GDC users\n\nSee 
https://github.com/NCI-GDC/gdc-somatic-variant-calling-workflow.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnci-gdc%2Fmutect2-cwl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnci-gdc%2Fmutect2-cwl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnci-gdc%2Fmutect2-cwl/lists"}