{"id":32210067,"url":"https://github.com/knausb/vcfr","last_synced_at":"2025-10-22T06:24:38.753Z","repository":{"id":11466884,"uuid":"13932575","full_name":"knausb/vcfR","owner":"knausb","description":"Tools to work with variant call format files","archived":false,"fork":false,"pushed_at":"2025-02-21T22:15:14.000Z","size":21966,"stargazers_count":262,"open_issues_count":41,"forks_count":54,"subscribers_count":15,"default_branch":"master","last_synced_at":"2025-10-20T21:41:10.456Z","etag":null,"topics":["genomics","population-genetics","population-genomics","rcpp","rstats","vcf-data","visualization"],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/knausb.png","metadata":{"files":{"readme":"README.md","changelog":"NEWS.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2013-10-28T17:09:04.000Z","updated_at":"2025-09-09T00:49:38.000Z","dependencies_parsed_at":"2024-06-19T01:45:35.695Z","dependency_job_id":"9d1c6615-10c1-4618-bc6e-9737cd3eada0","html_url":"https://github.com/knausb/vcfR","commit_stats":{"total_commits":1335,"total_committers":14,"mean_commits":95.35714285714286,"dds":0.03146067415730336,"last_synced_commit":"a160996e35437a657774e2769d43ea2ee960d24c"},"previous_names":[],"tags_count":15,"template":false,"template_full_name":null,"purl":"pkg:github/knausb/vcfR","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/knausb%2FvcfR","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/knausb%2FvcfR/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/knausb%2FvcfR/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/knausb%2FvcfR/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/knausb","download_url":"https://codeload.github.com/knausb/vcfR/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/knausb%2FvcfR/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":280391601,"owners_count":26322952,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-22T02:00:06.515Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["genomics","population-genetics","population-genomics","rcpp","rstats","vcf-data","visualization"],"created_at":"2025-10-22T06:24:36.585Z","updated_at":"2025-10-22T06:24:38.742Z","avatar_url":"https://github.com/knausb.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n## VcfR: a package to manipulate and visualize [VCF](https://github.com/samtools/hts-specs) data in R\n\n\nOn CRAN:\n[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/vcfR)](https://cran.r-project.org/package=vcfR)\n[![](http://cranlogs.r-pkg.org/badges/grand-total/vcfR)](https://cran.r-project.org/package=vcfR)\n[![](http://cranlogs.r-pkg.org/badges/vcfR)](https://cran.r-project.org/package=vcfR)\n\nAppveyor (Windows):\n[![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/knausb/vcfR?branch=master\u0026svg=true)](https://ci.appveyor.com/project/knausb/vcfR)\n\nCoveralls:\n[![Coverage Status](https://coveralls.io/repos/github/knausb/vcfR/badge.svg?branch=master)](https://coveralls.io/github/knausb/vcfR?branch=master)\n\n\u003c!-- badges: start --\u003e\n[![R-CMD-check](https://github.com/knausb/vcfR/workflows/R-CMD-check/badge.svg)](https://github.com/knausb/vcfR/actions)\n\u003c!-- badges: end --\u003e\n\n\n*****\n\n\n![Supercontig_50](tools/pinfsc50.png)\n\n\nVcfR is an R package intended to allow easy manipulation and visualization of variant call format (VCF) data.\nFunctions are provided to rapidly read from and write to VCF files.\nOnce VCF data is read into R a parser function extracts matrices from the VCF data for use with typical R functions.\nThis information can then be used for quality control or other purposes.\nAdditional functions provide visualization of genomic data.\nOnce processing is complete data may be written to a VCF file or converted into other popular R objects (e.g., genlight, DNAbin).\nVcfR provides a link between VCF data and the R environment connecting familiar software with genomic data.\n\n\nVcfR is built upon two data structures.\n\n**vcfR** - S4 class to contain data from a VCF file.\n\n**chromR** - S4 class to contain variant information (VCF) and optional sequence (FASTA) and annotation (GFF) information.\n\n\nFunctions in vcfR provide the ability to subset VCF data as well as to extract and parse the data.\nFor example, individual genotypes, sequence depths or genotype likelihoods (when provided in the VCF file) can easily be accessed.\nThese tools are provided to aid researchers in rapidly surveying the quality and other characteristics of data provided as VCF data.\nWith this information in hand, researchers should be able to determine criteria for hard filtering in order to attempt to maximize biological variation and minimize technical variation.\n\n\n## Documentation\n\nDocumentation for vcfR can now be found here: [vcfR_documentation](https://knausb.github.io/vcfR_documentation/).\n\nWe also have [Population genetics and genomics in R](https://grunwaldlab.github.io/Population_Genetics_in_R/index.html) which is more general and provides examples of analyses.\n\nIf you think you've found a bug, please see [reporting an issue](https://knausb.github.io/vcfR_documentation/reporting_issue.html).\n\n## Publication\n\n### vcfR articles\n\nKnaus, Brian J., and Niklaus J. Grunwald. 2017. VCFR: a package to manipulate and visualize variant call format data in R. Molecular Ecology Resources 17(1):44-53. http://dx.doi.org/10.1111/1755-0998.12549.\n\nKnaus, Brian J., and Niklaus J. Grunwald. 2016. VcfR: an R package to manipulate and visualize VCF format data. bioRxiv: 041277. http://dx.doi.org/10.1101/041277.\n\n\n### Copy number variation article\n\nKnaus, Brian, and Niklaus J. Grünwald. 2018. Inferring variation in copy number using high throughput sequencing data in R. Frontiers in Genetics 9: 123. http://dx.doi.org/10.3389/fgene.2018.00123.\n\n\n## Download\n\n[vcfR](https://cran.r-project.org/package=vcfR) is available at CRAN.\nTo install use:\n\n    install.packages('vcfR')\n\n\n\nThe development version can be installed through github:\n\n    devtools::install_github(repo=\"knausb/vcfR\")\n    library(vcfR)\n\n\nIf you would like the vignettes use:\n\n    devtools::install_github(repo=\"knausb/vcfR\", build_vignettes=TRUE)\n\n\nIf you've built the vignettes, you can browse them with:\n\n    browseVignettes(package=\"vcfR\")\n\n\nIf you've installed this package with devtools you will probably need to run:\n\n    devtools::install(build_vignettes = TRUE)\n    \n\n------\n\n## Devel branch\n\nThe devel branch (which may not be stable) can also be installed:\n\n    devtools::install_github(repo=\"knausb/vcfR@devel\")\n    library(vcfR)\n\n\nAnd to build the vignettes:\n\n    devtools::install_github(repo=\"knausb/vcfR@devel\", build_vignettes=TRUE)\n\n\n------\n\n## Software that produce VCF files\n\n\nA fun part of this project has been learning about how people use vcfR.\nOne facet of this is learning about the software that create VCF files.\nSo I've decided to make a list of these software.\nIf you know of a software that I have not included on this list, particularly if you can report that vcfR works with its files, feel free to let me know!\n\n\n**Genomic variant callers:**\n\n* [Cortex](https://cortexassembler.sourceforge.net/)\n* [freebayes](https://github.com/freebayes/freebayes)\n* [GATK haplotype caller](https://gatk.broadinstitute.org)\n* [GATK MuTect2](https://gatk.broadinstitute.org)\n* [GATK GenotypeGVCFs](https://gatk.broadinstitute.org)\n* [LoFreq](http://csb5.github.io/lofreq/)\n* [Samtools](http://www.htslib.org/)\n* [VarScan2](http://dkoboldt.github.io/varscan/)\n\n\n**Restriction site associated DNA markers (e.g., RADseq, GBS):**\n\n* [Stacks](http://catchenlab.life.illinois.edu/stacks/)\n* [Tassel](https://www.maizegenetics.net)\n\n**Manipulation of VCF data:**\n\n* [Beagle v4.1](https://faculty.washington.edu/browning/beagle/beagle.html) Inputs VCF genotypes and outputs phased genotypes to VCF format\n* [pegas::read.vcf](https://cran.r-project.org/package=pegas) Population and Evolutionary Genetics Analysis System\n* [PyVCF](https://pyvcf.readthedocs.io/en/latest/)\n* [SnpEff](https://snpeff.sourceforge.net/) Genetic variant annotation and effect prediction toolbox\n* [Picard](http://broadinstitute.github.io/picard/index.html) A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF\n* [VCF-kit](https://github.com/AndersenLab/VCF-kit) VCF-kit is a command-line based collection of utilities for performing analysis on Variant Call Format (VCF) files.\n* [VCFtools](https://vcftools.github.io/) General manipulation and analysis\n* [VariantAnnotation::readVcf](https://bioconductor.org/packages/release/bioc/html/VariantAnnotation.html) Bioconductor package for annotating variants\n\n**R packages that read VCF data**\n\n* [VariantAnnotation](https://bioconductor.org/packages/release/bioc/html/VariantAnnotation.html)\n* [pegas](https://cran.r-project.org/package=pegas)\n\n------\n\nEnjoy!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fknausb%2Fvcfr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fknausb%2Fvcfr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fknausb%2Fvcfr/lists"}