{"id":47790907,"url":"https://github.com/oicr-gsi/sample-fingerprinting","last_synced_at":"2026-04-03T15:37:46.239Z","repository":{"id":46120633,"uuid":"38776452","full_name":"oicr-gsi/sample-fingerprinting","owner":"oicr-gsi","description":"workflow that generates genotype fingerprints consumed by SampleFingerprinting workflow","archived":false,"fork":false,"pushed_at":"2025-09-11T13:58:01.000Z","size":1242,"stargazers_count":0,"open_issues_count":0,"forks_count":2,"subscribers_count":15,"default_branch":"master","last_synced_at":"2025-09-11T16:37:31.621Z","etag":null,"topics":["fingerprinting","genotyping","qc","reporting","variant-calling","workflow"],"latest_commit_sha":null,"homepage":"","language":"WDL","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/oicr-gsi.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2015-07-08T19:55:00.000Z","updated_at":"2025-09-11T13:58:06.000Z","dependencies_parsed_at":"2025-09-11T15:41:15.536Z","dependency_job_id":"49e0815d-2319-41b2-8109-367265fed7e1","html_url":"https://github.com/oicr-gsi/sample-fingerprinting","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/oicr-gsi/sample-fingerprinting","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oicr-gsi%2Fsample-fingerprinting","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oicr-gsi%2Fsample-fingerprinting/tags","releases_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oicr-gsi%2Fsample-fingerprinting/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oicr-gsi%2Fsample-fingerprinting/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/oicr-gsi","download_url":"https://codeload.github.com/oicr-gsi/sample-fingerprinting/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oicr-gsi%2Fsample-fingerprinting/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31360807,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-03T15:19:21.178Z","status":"ssl_error","status_checked_at":"2026-04-03T15:19:20.670Z","response_time":107,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fingerprinting","genotyping","qc","reporting","variant-calling","workflow"],"created_at":"2026-04-03T15:37:45.482Z","updated_at":"2026-04-03T15:37:46.226Z","avatar_url":"https://github.com/oicr-gsi.png","language":"WDL","funding_links":[],"categories":[],"sub_categories":[],"readme":"# fingerprintCollector\n\nFingerprintCollector 2.1, workflow that generates genotype fingerprints consumed by SampleFingerprinting workflow\n\n## Overview\n\nFingerprint Collector workflow produces \"fingerprint\" data for input alignments passed as 
.bam files. It is part of the original implementation, and its task is to produce all intermediate data needed just before the similarity matrix and sample swap report are created. The goal is to reduce the load on the system by splitting the workflow and collecting variation data independently for each input .bam file. The graph below describes the process:\n\n![sample-fingerprinting flowchart](docs/FingerprintCollector_specs.png)\n\n## Dependencies\n\n* [gatk 4.1.7.0, gatk 3.6.0](https://gatk.broadinstitute.org)\n* [tabix 0.2.6](http://www.htslib.org)\n* [python 3.6](https://www.python.org/)\n\n\n## Usage\n\n### Cromwell\n```\njava -jar cromwell.jar run fingerprintCollector.wdl --inputs inputs.json\n```\n\n### Inputs\n\n#### Required workflow parameters:\nParameter|Value|Description\n---|---|---\n`inputBam`|File|Input lane-level BAM file\n`inputBai`|File|Index for the input BAM file\n`refFasta`|String|Path to the reference fasta file\n`hotspotSNPs`|String|Path to the gzipped hotspot vcf file\n`runHaplotypeCaller.modules`|String|Names and versions of modules\n`runDepthOfCoverage.modules`|String|Names and versions of modules\n`runFinCreator.modules`|String|Names and versions of modules\n\n\n#### Optional workflow parameters:\nParameter|Value|Default|Description\n---|---|---|---\n`outputFileNamePrefix`|String|basename(inputBam,\".bam\")|Output prefix, customizable. 
Default is the input file's basename.\n\n\n#### Optional task parameters:\nParameter|Value|Default|Description\n---|---|---|---\n`runHaplotypeCaller.jobMemory`|Int|8|memory allocated for Job\n`runHaplotypeCaller.timeout`|Int|24|Timeout in hours, needed to override imposed limits\n`runHaplotypeCaller.stdCC`|Float|30.0|standard call confidence score, default is 30\n`runDepthOfCoverage.jobMemory`|Int|8|memory allocated for Job\n`runDepthOfCoverage.timeout`|Int|24|Timeout in hours, needed to override imposed limits\n`runFinCreator.chroms`|Array[String]|[\"chr1\", \"chr2\", \"chr3\", \"chr4\", \"chr5\", \"chr6\", \"chr7\", \"chr8\", \"chr9\", \"chr10\", \"chr11\", \"chr12\", \"chr13\", \"chr14\", \"chr15\", \"chr16\", \"chr17\", \"chr18\", \"chr19\", \"chr20\", \"chr21\", \"chr22\", \"chrX\"]|Canonical chromosomes in desired order (used for sorting lines in .fin file)\n`runFinCreator.timeout`|Int|10|Timeout in hours, needed to override imposed limits\n`runFinCreator.jobMemory`|Int|8|memory allocated for Job\n\n\n### Outputs\n\nOutput | Type | Description | Labels\n---|---|---|---\n`outputVcf`|File|gzipped vcf file with the variant calls made at the hotspot positions|vidarr_label: outputVcf\n`outbutTbi`|File|tabix index for the gzipped vcf file|vidarr_label: outbutTbi\n`outputFin`|File|Custom format file, shows which hotspots were called as variants|vidarr_label: outputFin\n\n\n## Commands\n \nThis section lists command(s) run by the fingerprintCollector workflow\n \n* Running fingerprintCollector\n \n### GATK Haplotype Caller using a list of genotyping hotspots:\n \n```\n  gatk HaplotypeCaller\n      -R REF_FASTA\n      -I INPUT_BAM\n      -O SAMPLE_ID.snps.raw.vcf\n     --read-filter CigarContainsNoNOperator\n     --stand-call-conf STD_CC\n      -L HOTSPOT_SNPS\n \n  bgzip -c SAMPLE_ID.snps.raw.vcf \u003e SAMPLE_ID.snps.raw.vcf.gz\n  tabix -p vcf SAMPLE_ID.snps.raw.vcf.gz \n \n```\n \n### Depth of Coverage analysis:\n \n```\n  java -jar 
GenomeAnalysisTK.jar \n       -R REF_FASTA\n       -T DepthOfCoverage\n       -I INPUT_BAM\n       -o SAMPLE_ID\n       -filterRNC\n       -L HOTSPOT_SNPS \n \n```\n \n### Creation of a fingerprint file:\n \n```\n  ...\n \n  Custom python code producing .fin file with hotspot calls\n \n  please refer to the fingerprintCollector.wdl for source\n \n```\n\n## Support\n\nFor support, please file an issue on the [GitHub project](https://github.com/oicr-gsi) or send an email to gsi@oicr.on.ca.\n\n_Generated with generate-markdown-readme (https://github.com/oicr-gsi/gsi-wdl-tools/)_\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foicr-gsi%2Fsample-fingerprinting","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Foicr-gsi%2Fsample-fingerprinting","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foicr-gsi%2Fsample-fingerprinting/lists"}