{"id":16618570,"url":"https://github.com/brentp/hts-nim-tools","last_synced_at":"2025-08-09T05:32:11.963Z","repository":{"id":66472197,"uuid":"116424641","full_name":"brentp/hts-nim-tools","owner":"brentp","description":"useful command-line tools written to showcase hts-nim","archived":false,"fork":false,"pushed_at":"2020-11-10T01:22:05.000Z","size":12,"stargazers_count":49,"open_issues_count":3,"forks_count":6,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-01-17T20:46:20.864Z","etag":null,"topics":["bam","bioinformatics","genomics","nim","nim-lang","vcf","vcf-filtering"],"latest_commit_sha":null,"homepage":"https://github.com/brentp/hts-nim","language":"Nim","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/brentp.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-01-05T20:37:33.000Z","updated_at":"2024-05-29T17:34:00.000Z","dependencies_parsed_at":"2023-02-22T12:00:47.258Z","dependency_job_id":null,"html_url":"https://github.com/brentp/hts-nim-tools","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brentp%2Fhts-nim-tools","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brentp%2Fhts-nim-tools/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brentp%2Fhts-nim-tools/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brentp%2Fhts-nim-tools/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hos
ts/GitHub/owners/brentp","download_url":"https://codeload.github.com/brentp/hts-nim-tools/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":242980782,"owners_count":20216285,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bam","bioinformatics","genomics","nim","nim-lang","vcf","vcf-filtering"],"created_at":"2024-10-12T02:20:42.724Z","updated_at":"2025-03-11T05:45:48.119Z","avatar_url":"https://github.com/brentp.png","language":"Nim","funding_links":[],"categories":[],"sub_categories":[],"readme":"# hts-nim-tools\n\nThis repository contains a number of tools created with [hts-nim](https://github.com/brentp/hts-nim/) intended\nto serve as examples for using `hts-nim` as well as to be useful tools.\n\nThese tools are:\n\n```\nhts-nim utility programs.\nversion: $version\n\n\t• bam-filter    : filter BAM/CRAM/SAM files with a simple expression language\n\t• count-reads   : count BAM/CRAM reads in regions given in a BED file\n\t• vcf-check     : check regions of a VCF against a background for missing chunks\n```\n\nEach of these is described in more detail below.\n\n# bam-filter\n\nUse simple expressions to filter a BAM/CRAM file:\n\n```\nbam-filter\n\n  Usage: bam-filter [options] \u003cexpression\u003e \u003cBAM-or-CRAM\u003e\n\n  -t --threads \u003cthreads\u003e       number of BAM decompression threads [default: 0]\n  -f --fasta \u003cfasta\u003e           fasta file for use with CRAM files [default: $env_fasta].\n```\n\nValid expressions may access the BAM attributes:\n\n+  `mapq` / `start` / `pos` / `end` / 
`flag` / `insert_size` (where pos is the 1-based start)\n+ `is_aligned` `is_read1` `is_read2` `is_supplementary` `is_secondary` `is_dup` `is_qcfail`\n+ `is_reverse` `is_mate_reverse` `is_pair` `is_proper_pair` `is_mate_unmapped` `is_unmapped`\n\nTo use aux tags, prefix them with 'tag_', e.g.:\n\n  tag_NM \u003c 2. Any tag present in the BAM can be used in this manner.\n\nExample:\n```\nbam-filter \"tag_NM == 2 \u0026\u0026 tag_RG == 'SRR741410' \u0026\u0026 is_proper_pair\" tests/HG02002.bam\n```\n\n# count-reads\n\n`count-reads` reports the number of reads overlapping each interval in a BED file.\n\n```\ncount-reads\n\n  Usage: count-reads [options] \u003cBED\u003e \u003cBAM-or-CRAM\u003e\n\nArguments:\n\n  \u003cBED\u003e          the bed file containing regions in which to count reads.\n  \u003cBAM-or-CRAM\u003e  the alignment file for which to calculate depth.\n\nOptions:\n\n  -t --threads \u003cthreads\u003e      number of BAM decompression threads [default: 0]\n  -f --fasta \u003cfasta\u003e          fasta file for use with CRAM files [default: ].\n  -F --flag \u003cFLAG\u003e            exclude reads with any of the bits in FLAG set [default: 1796]\n  -Q --mapq \u003cmapq\u003e            mapping quality threshold [default: 0]\n  -h --help                     show help\n```\n\nThis outputs one line with a read count for each line in \u003cBED\u003e.\n\n# vcf-check\n\n`vcf-check` is useful as a quality control for large projects which have done variant calling in regions\nwhere each region is called in parallel. With many regions, and large projects, some regions can error and\nthis might be unknown to the analyst.\n\nThis tool takes a background VCF, such as gnomAD, that has full genome (though in some cases, users will\ninstead want whole exome) coverage and uses that as an expectation of variants. 
**If the background has many\nvariants across a long stretch of genome where the query VCF has no variation, we can expect that region is\nmissed in the query VCF.**\n\n```\nCheck a VCF against a background to make sure that there are no large missing chunks.\n\n  vcf-check\n\n  Usage: vcf-check [options] \u003cBACKGROUND_VCF\u003e \u003cVCF\u003e\n\nArguments:                                                                                                                                                 \n  \u003cBACKGROUND_VCF\u003e        population VCF/BCF with expected sites\n  \u003cVCF\u003e                   query VCF/BCF to check\n\nOptions:\n\n  -c --chunk \u003cINT\u003e        chunk size for genome [default: 100000]\n  -m --maf \u003cFLOAT\u003e        allele frequency  cutoff [default: 0.1]\n```\n\nThis will output a tab-delimited file of `chrom\\tposition\\tbackground-count\\tquery-count`.\n\nThe user can find regions that might be problematic by plotting or with some simple `awk` commands.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrentp%2Fhts-nim-tools","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbrentp%2Fhts-nim-tools","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrentp%2Fhts-nim-tools/lists"}