{"id":32209912,"url":"https://github.com/zhanxw/seqminer","last_synced_at":"2026-02-18T21:03:20.701Z","repository":{"id":56934125,"uuid":"14078358","full_name":"zhanxw/seqminer","owner":"zhanxw","description":"Query sequence data (VCF/BCF1/BCF2, Tabix, BGEN, PLINK) in R","archived":false,"fork":false,"pushed_at":"2025-10-01T01:03:36.000Z","size":8386,"stargazers_count":32,"open_issues_count":18,"forks_count":12,"subscribers_count":2,"default_branch":"master","last_synced_at":"2026-01-12T20:21:35.547Z","etag":null,"topics":["annotation","bcf","bgen","meta-analysis","next-generation-sequencing","plink","sequencing","tabix","vcf","workflow"],"latest_commit_sha":null,"homepage":"http://zhanxw.github.io/seqminer/","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zhanxw.png","metadata":{"files":{"readme":"README.md","changelog":"ChangeLog","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2013-11-03T01:33:16.000Z","updated_at":"2025-10-25T21:27:38.000Z","dependencies_parsed_at":"2023-02-15T18:16:06.093Z","dependency_job_id":"cf8ba37f-3cf8-49ec-afb9-b6cc588e8c39","html_url":"https://github.com/zhanxw/seqminer","commit_stats":{"total_commits":192,"total_committers":8,"mean_commits":24.0,"dds":"0.22916666666666663","last_synced_commit":"86012b04058fcfd0727a5910460e15d2c65a6d4c"},"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/zhanxw/seqminer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zhanxw%2Fseqminer","tags_
url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zhanxw%2Fseqminer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zhanxw%2Fseqminer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zhanxw%2Fseqminer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zhanxw","download_url":"https://codeload.github.com/zhanxw/seqminer/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zhanxw%2Fseqminer/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29596127,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-18T20:59:56.587Z","status":"ssl_error","status_checked_at":"2026-02-18T20:58:41.434Z","response_time":162,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["annotation","bcf","bgen","meta-analysis","next-generation-sequencing","plink","sequencing","tabix","vcf","workflow"],"created_at":"2025-10-22T06:20:32.692Z","updated_at":"2026-02-18T21:03:20.693Z","avatar_url":"https://github.com/zhanxw.png","language":"C","readme":"SEQMINER2\n========\n\n[![R-CMD-check](https://github.com/zhanxw/seqminer/workflows/R-CMD-check/badge.svg)](https://github.com/zhanxw/seqminer/actions)\n[![AppVeyor build 
status](https://ci.appveyor.com/api/projects/status/github/zhanxw/seqminer?branch=master\u0026svg=true)](https://ci.appveyor.com/project/zhanxw/seqminer)\n![](https://cranlogs.r-pkg.org/badges/seqminer)\n[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/seqminer)](https://cran.r-project.org/package=seqminer)\n\n\n**Table of Contents**\n\n- [Introduction](#introduction)\n- [Download](#download)\n- [Showcase](#showcase)\n  - [Index VCF/BCF files](#index-vcfbcf-files)\n  - [Query VCF/BCF files](#query-vcfbcf-files)\n  - [Query BGEN/PLINK files](#query-bgenplink-files)\n  - [Command line interface](#command-line-interface)\n\n# Introduction\nSeqminer is a highly efficient R package for retrieving sequence variants from biobank-scale datasets of millions of individuals and billions of genetic variants. It supports all variant types, including multi-allelic variants and imputation dosages. It takes VCF/BCF/BGEN/PLINK files as input, indexes them, queries them using a variant-based index, and loads the results as R data types such as lists or matrices.\n\n# Download\nInstall the development version (the [devtools](https://github.com/r-lib/devtools) package is required):\n\n    devtools::install_github(\"zhanxw/seqminer\")\n\n# Showcase\nHere are some examples of how to use seqminer to index and query files in real-life scenarios.\n\n## Index VCF/BCF files\n\n    library(seqminer)\n    bcf.ref.file \u003c- \"input.bcf\"\n    bcf.idx.file \u003c- \"input.bcf.scIdx\"\n    out \u003c- seqminer::createSingleChromosomeBCFIndex(bcf.ref.file, bcf.idx.file)\n\nor\n\n    vcf.ref.file \u003c- \"input.vcf.gz\"\n    vcf.idx.file \u003c- \"input.vcf.gz.scIdx\"\n    out \u003c- seqminer::createSingleChromosomeVCFIndex(vcf.ref.file, vcf.idx.file)\n\nThis generates a variant-based index that works with commonly used sequence variant file formats, such as VCF and BCF.\n\n## Query VCF/BCF files\n\nQuery VCF file:\n\n    vcf.ref.file \u003c-  \"input.vcf.gz\"\n    vcf.idx.file \u003c-  
\"input.vcf.gz.scIdx\"\n    tabix.range \u003c- \"1:123-1234\"\n    geno \u003c- seqminer::readSingleChromosomeVCFToMatrixByRange(vcf.ref.file, tabix.range, vcf.idx.file)\n\nQuery BCF file:\n\n    bcf.ref.file \u003c- \"input.bcf\"\n    bcf.idx.file \u003c- \"input.bcf.scIdx\"\n    tabix.range \u003c- \"1:123-1234\"\n    geno \u003c- seqminer::readSingleChromosomeBCFToMatrixByRange(bcf.ref.file, tabix.range, bcf.idx.file)\n\nTo query multiple regions, list them in one string separated by commas, e.g. `\"1:123-124,1:1234-1235\"`.\n\nOutput example (columns represent variants, rows represent individuals):\n\n\u003cimg src=\"https://github.com/yang-lina/seqminer/blob/master/output.png\" width=\"60%\"\u003e\n\n## Query BGEN/PLINK files\n\nQuery BGEN file:\n\n    bg.ref.file \u003c- \"input.bgen\"\n    bg.range \u003c- \"1:123-1234\"\n    geno.mat \u003c- seqminer::readBGENToMatrixByRange(bg.ref.file, bg.range)\n    geno.list \u003c- seqminer::readBGENToListByRange(bg.ref.file, bg.range)\n\nMake sure that the BGEN file has an index file (`*.bgi`) in the same folder.\n\nQuery PLINK file:\n\n    plink.ref.file \u003c- \"input\"\n    geno \u003c- seqminer::readPlinkToMatrixByIndex(plink.ref.file, sampleIndex=1:20000, markerIndex=1:100)\n\n## Command line interface\nWe also developed a seqminer command line interface:\n\n    ./queryVCFIndex.intel input.vcf.gz input.vcf.gz.scIdx 1:123-1234\n\nCitation:\n\n[Yang, L., Jiang, S., Jiang, B., Liu, D. J., \u0026 Zhan, X. (2020). Seqminer2: An Efficient Tool to Query and Retrieve Genotypes for Statistical Genetics Analyses from Biobank Scale Sequence Dataset. Bioinformatics](https://doi.org/10.1093/bioinformatics/btaa628)\n\n[Zhan, X. and Liu, D. J. (2015), SEQMINER: An R-Package to Facilitate the Functional Interpretation of Sequence-Based Associations. Genet. Epidemiol., 39: 619–623. 
doi:10.1002/gepi.21918](https://doi.org/10.1002/gepi.21918)\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzhanxw%2Fseqminer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzhanxw%2Fseqminer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzhanxw%2Fseqminer/lists"}