{"id":32204631,"url":"https://github.com/ngthomas/microhaplot","last_synced_at":"2025-10-22T04:59:47.054Z","repository":{"id":49202000,"uuid":"66886999","full_name":"ngthomas/microhaplot","owner":"ngthomas","description":"microhaplotype visualizer and analyzer","archived":false,"fork":false,"pushed_at":"2021-06-24T08:11:40.000Z","size":20752,"stargazers_count":20,"open_issues_count":4,"forks_count":7,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-10-22T04:59:43.454Z","etag":null,"topics":["amplicon-sequencing","microhaplot-shiny","shiny","vcf"],"latest_commit_sha":null,"homepage":"https://ngthomas.github.io/microhaplot","language":"Perl","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ngthomas.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-08-29T23:01:11.000Z","updated_at":"2025-10-08T00:49:53.000Z","dependencies_parsed_at":"2022-09-05T11:32:15.515Z","dependency_job_id":null,"html_url":"https://github.com/ngthomas/microhaplot","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/ngthomas/microhaplot","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ngthomas%2Fmicrohaplot","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ngthomas%2Fmicrohaplot/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ngthomas%2Fmicrohaplot/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ngthomas%2Fmicrohaplot/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ngthomas","download_url":"https://codeload.gith
ub.com/ngthomas/microhaplot/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ngthomas%2Fmicrohaplot/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":280382997,"owners_count":26321423,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-22T02:00:06.515Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["amplicon-sequencing","microhaplot-shiny","shiny","vcf"],"created_at":"2025-10-22T04:59:39.726Z","updated_at":"2025-10-22T04:59:47.046Z","avatar_url":"https://github.com/ngthomas.png","language":"Perl","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts_chunk$set(echo = TRUE)\n```\n\n# microhaplot \u003cimg src=\"https://i.ibb.co/68M0tpT/microhaplot-logo.png\" align=\"right\" width=\"200\"/\u003e\n\n\u003c!-- badges: start --\u003e\n  [![CRAN status](https://www.r-pkg.org/badges/version/microhaplot)](https://CRAN.R-project.org/package=microhaplot)\n\u003c!-- badges: end --\u003e\n  \n`microhaplot` generates visual summaries of microhaplotypes found in short read alignments. All you need are alignment SAM \nfiles and a variant call VCF file. (The latter tells `microhaplot` which SNPs to include into microhaplotypes).  
It was \ndesigned for extracting and visualizing haplotypes from high-quality amplicon sequencing data.  We have used it extensively\nto process amplicon sequencing data (with 100 to 500 amplicons) from rockfish and Chinook salmon, generated on an Illumina \nMiSeq sequencer.  It should be extensible to sequences from capture arrays, like RAPTURE data.\n\nThis software exists as an R package `microhaplot` that includes within it the code to set up and \nestablish an RStudio/Shiny server to visualize and manipulate the data.  There are two key steps in \nthe `microhaplot` workflow:\n\n1. The first step is to summarize alignment and variant (SNP) data into a single data frame that is \neasily operated upon.  This is done using the function `microhaplot::prepHaplotFiles`.  You must supply a \nVCF file that includes the variants that you are interested in extracting, and as many SAM files \n(one for each individual) as you want to extract read information from at each of the variants. \nThe function `microhaplot::prepHaplotFiles` makes a call\nto Perl to parse the CIGAR strings in the SAM files, extract the variant information at each read,\nand store this information in a data frame, which gets saved with the installed Shiny app (see below)\nfor later use.  Depending on the size of the data set, this can take a few minutes.  \n\n2. The second step is to run the microhaplot Shiny app to visualize the sequence information, call genotypes using\nsimple read-depth based filtering criteria, and curate the loci. microhaplot is suitable for quick assessment\nand quality control of haplotypes generated from library runs. Plot summaries include read depth, fraction of callable haplotypes, Hardy-Weinberg\nequilibrium plots, and more. 
\n\n   \n\u003ccenter\u003e\u003cimg src=\"https://i.ibb.co/F5JtHj1/microhaplot-demo-1.gif\" align=\"center\" width=\"500\"/\u003e\u003c/center\u003e\n   \n     \nSee the **Example Data** section to learn about how to run each of these steps on the example data that are provided\nwith the package.  \n\n   \n## Installation and Quick Start\n\n### Required Perl dependencies:\nYou need to have Perl (version \u003e5.014) installed on your system in order to run microhaplot.  \nFor Windows users, we recommend installing it via http://strawberryperl.com/.  \nFor Mac and Linux users, Perl can be downloaded from https://www.perl.org/get.html  \n\nYou can either clone the repository and build the `microhaplot` package yourself, or, more easily, you can\ninstall it using [devtools](https://github.com/hadley/devtools). You can get `devtools` with `install.packages(\"devtools\")`.\n  \n**Mac users: remember to install [XQuartz](https://www.xquartz.org/) when upgrading macOS to a new major version.**   \n \nOnce you have `devtools` available in R, you can get `microhaplot` this way:\n```r\ndevtools::install_github(\"ngthomas/microhaplot\", build_vignettes = TRUE, build_opts = c(\"--no-resave-data\", \"--no-manual\"))\n```\n\nOnce you have installed the `microhaplot` R package with devtools, you need to use `microhaplot::mvShinyHaplot`\nto establish the microhaplot Shiny app in a convenient location on your system. The following line\ncreates the directory `Shiny` in my home directory and then within that it creates the \ndirectory `microhaplot` and fills it with the Shiny app as well as the example data that go \nalong with that.  \n\n```r\nmicrohaplot::mvShinyHaplot(\"~/Shiny\") # provide a directory path to host the microhaplot app\n```\nTo start familiarizing yourself with microhaplot using the provided example data, we recommend\ngoing through our first vignette.  
Call it up with:\n```r\nbrowseVignettes(\"microhaplot\")\n```\nand check out `microhaplot-walkthrough`.\n\nNow, having done that, we can launch Shiny microhaplot on the example data:\n```r\nlibrary(microhaplot)\napp.path \u003c- \"~/Shiny/microhaplot\"\nrunShinyHaplot(app.path)\n```\n\n## Quick Guide to use microhaplot to parse out SAM and VCF files\n\nThe microhaplot package comes with a small customized sample data set drawn from an actual \nshort-read sequencing run on rockfish species. The sample data\ncontain sequences of eight genomic loci for four populations of five individuals each, \nfor a total of twenty individuals. \n\nFirst you need to create a tab-separated **label** file with three columns: path to the SAM file, individual ID, and group label (in this order). If you do not want to assign a group label to an individual, you can just leave it as \"NA\". It is recommended that you keep all of the SAM files under one directory to make this labeling task easier.\n\nThe `label` file looks like this:\n```txt\ns6.sam  s6      copper\ns11.sam s11     copper\ns13.sam s13     gold\ns14.sam s14     kelp\ns18.sam s18     gold\n``` \n\nOnce you have the label file in place, you can run `prepHaplotFiles`, an R function that generates tables of microhaplotypes, by providing the following:\n * a run label to display in microhaplot\n * path to the directory with all SAM files \n * path to the `label` file you just created\n * path to the VCF file  \n * optional number of threads (for non-Windows users); we recommend 2 * the number of processors \n \n```R\nlibrary(microhaplot)\n\n# to access package sample case study dataset of rockfish\nrun.label \u003c- \"sebastes\"\n\nsam.path \u003c- tempdir()\nuntar(system.file(\"extdata\",\n                  \"sebastes_sam.tar.gz\",\n                  package=\"microhaplot\"),\n      exdir = sam.path)\n      \nlabel.path \u003c- file.path(sam.path, \"label.txt\")\nvcf.path \u003c- file.path(sam.path, \"sebastes.vcf\")\nout.path 
\u003c- tempdir()\napp.path \u003c- \"~/Shiny/microhaplot\"\n\n# for your dataset: customize the following paths\n# sam.path \u003c- \"~/microhaplot/extdata/\"\n# label.path \u003c- \"~/microhaplot/extdata/label.txt\"\n# vcf.path \u003c- \"~/microhaplot/extdata/sebastes.vcf\"\n# app.path \u003c- \"~/Shiny/microhaplot\"\n\nhaplo.read.tbl \u003c- prepHaplotFiles(run.label = run.label,\n                            sam.path = sam.path,\n                            out.path = out.path,\n                            label.path = label.path,\n                            vcf.path = vcf.path,\n                            app.path = app.path,\n                            n.jobs = 4) # 2 jobs per processor, assuming a dual-core machine\n                            \n\nrunShinyHaplot(app.path)\n```\n\n\n## Suggestions\n- SAM files: For paired-end experiments, the forward and reverse reads should be merged (\"flashed\") into a single read before alignment.\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fngthomas%2Fmicrohaplot","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fngthomas%2Fmicrohaplot","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fngthomas%2Fmicrohaplot/lists"}