{"id":23768561,"url":"https://github.com/nci-gdc/mutect2-tool","last_synced_at":"2026-05-16T11:34:37.167Z","repository":{"id":52691853,"uuid":"49169837","full_name":"NCI-GDC/mutect2-tool","owner":"NCI-GDC","description":"GDC GATK3 MuTect2 docker","archived":false,"fork":false,"pushed_at":"2025-06-10T15:39:51.000Z","size":11767,"stargazers_count":0,"open_issues_count":4,"forks_count":2,"subscribers_count":9,"default_branch":"master","last_synced_at":"2025-09-07T06:23:09.789Z","etag":null,"topics":["bioinformatics","docker","workflow-tool"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/NCI-GDC.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":"NOTICE","maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2016-01-07T00:15:30.000Z","updated_at":"2021-04-20T21:43:24.000Z","dependencies_parsed_at":"2025-09-07T06:15:33.169Z","dependency_job_id":"e13d7218-1855-469f-b66c-6e2587c38ea3","html_url":"https://github.com/NCI-GDC/mutect2-tool","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/NCI-GDC/mutect2-tool","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NCI-GDC%2Fmutect2-tool","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NCI-GDC%2Fmutect2-tool/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NCI-GDC%2Fmutect2-tool/releases","manifests_url":"https://repos.ecosyste.ms/api
/v1/hosts/GitHub/repositories/NCI-GDC%2Fmutect2-tool/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/NCI-GDC","download_url":"https://codeload.github.com/NCI-GDC/mutect2-tool/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NCI-GDC%2Fmutect2-tool/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33100922,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-16T04:41:52.686Z","status":"ssl_error","status_checked_at":"2026-05-16T04:41:52.009Z","response_time":115,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","docker","workflow-tool"],"created_at":"2025-01-01T01:37:35.079Z","updated_at":"2026-05-16T11:34:37.150Z","avatar_url":"https://github.com/NCI-GDC.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# GDC GATK3 MuTect2\n![Version badge](https://img.shields.io/badge/GATK3.6-nightly--2016--02--25--gf39d340-\u003cCOLOR\u003e.svg)\n\nThe GATK3 MuTect2 pipeline employs a \"Panel of Normals\" to identify additional germline mutations. This panel is generated using TCGA blood normal genomes from thousands of individuals that were curated and confidently assessed to be cancer-free. 
This method allows for a higher level of confidence to be assigned to somatic variants that were called by the MuTect2 pipeline.\n\nOriginal GATK3 MuTect2: https://gatkforums.broadinstitute.org/gatk/discussion/9183/how-to-call-somatic-snvs-and-indels-using-mutect2\n\n## GATK3\n\nImportant note:\n\n* The GDC GATK MuTect2 version was frozen at the version in use when we delivered our first data release. The GATK team normally does not keep nightly versions beyond 30 days, which makes it very difficult to rebuild an identical docker image.\u003cbr\u003e\nHowever, according to the GATK team, it seems reasonable to use GATK3.7 as a replacement.\u003cbr\u003e\nhttps://gatkforums.broadinstitute.org/gatk/discussion/9406/where-can-i-find-the-gdc-mutect2-version\n* Please contact the GATK team for the GATK3.7 `GenomeAnalysisTK.jar`.\n\n## How to build\n\nhttps://docs.docker.com/engine/reference/builder/\n\nThe docker images have been tested under multiple environments. The most tested ones are:\n* Docker version 19.03.2, build 6a30dfc\n* Docker version 18.09.1, build 4c52b90\n* Docker version 18.03.0-ce, build 0520e24\n* Docker version 17.12.1-ce, build 7390fc6\n\n## For external users\nThe repository has only been tested on GDC data and in the particular environment GDC runs in. Some of the reference data required for the production workflow are hosted in [GDC reference files](https://gdc.cancer.gov/about-data/data-harmonization-and-generation/gdc-reference-files \"GDC reference files\"). 
For any questions related to GDC data, please contact the GDC Help Desk at support@nci-gdc.datacommons.io.\n\nThere is a production-ready CWL example at https://github.com/NCI-GDC/mutect2-cwl which uses the docker images that are built from the `Dockerfile`s in this repo.\n\nTo run multi-threading GATK3 MuTect2:\n\n```\n[INFO] [20200109 04:10:13] [multi_mutect2] - --------------------------------------------------------------------------------\n[INFO] [20200109 04:10:13] [multi_mutect2] - multi_mutect2_p3.py\n[INFO] [20200109 04:10:13] [multi_mutect2] - Program Args: docker/multi_mutect2/multi_mutect2.py -h\n[INFO] [20200109 04:10:13] [multi_mutect2] - --------------------------------------------------------------------------------\nusage: Internal multithreading MuTect2 calling. [-h] -j JAVA_HEAP -f\n                                                REFERENCE_PATH -r\n                                                INTERVAL_BED_PATH -t TUMOR_BAM\n                                                -n NORMAL_BAM -c THREAD_COUNT\n                                                -p PON -s COSMIC -d DBSNP -e\n                                                CONTEST [-m]\n\noptional arguments:\n  -h, --help            show this help message and exit\n  -j JAVA_HEAP, --java_heap JAVA_HEAP\n                        Java heap memory.\n  -f REFERENCE_PATH, --reference_path REFERENCE_PATH\n                        Reference path.\n  -r INTERVAL_BED_PATH, --interval_bed_path INTERVAL_BED_PATH\n                        Interval bed file.\n  -t TUMOR_BAM, --tumor_bam TUMOR_BAM\n                        Tumor bam file.\n  -n NORMAL_BAM, --normal_bam NORMAL_BAM\n                        Normal bam file.\n  -c THREAD_COUNT, --thread_count THREAD_COUNT\n                        Number of thread.\n  -p PON, --pon PON     Panel of normals reference path.\n  -s COSMIC, --cosmic COSMIC\n                        Cosmic reference path.\n  -d DBSNP, --dbsnp DBSNP\n                        dbSNP 
reference path.\n  -e CONTEST, --contest CONTEST\n                        Contamination estimation value from ContEst.\n  -m, --dontUseSoftClippedBases\n                        If specified, it will not analyze soft clipped bases\n                        in the reads.\n```\n\n## For GDC users\n\nSee https://github.com/NCI-GDC/gdc-somatic-variant-calling-workflow.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnci-gdc%2Fmutect2-tool","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnci-gdc%2Fmutect2-tool","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnci-gdc%2Fmutect2-tool/lists"}