{"id":27302947,"url":"https://github.com/tseemann/snippy","last_synced_at":"2025-04-12T02:49:15.886Z","repository":{"id":17038358,"uuid":"19802637","full_name":"tseemann/snippy","owner":"tseemann","description":":scissors: :zap: Rapid haploid variant calling and core genome alignment","archived":false,"fork":false,"pushed_at":"2024-07-19T02:05:50.000Z","size":130456,"stargazers_count":504,"open_issues_count":215,"forks_count":117,"subscribers_count":29,"default_branch":"master","last_synced_at":"2025-04-03T11:55:54.263Z","etag":null,"topics":["bacteria","bioinformatics","fastq-analysis","genomics","haploid","indel-discovery","snps","variant-calling","vcf"],"latest_commit_sha":null,"homepage":"","language":"Perl","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tseemann.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2014-05-15T01:56:12.000Z","updated_at":"2025-04-01T01:16:48.000Z","dependencies_parsed_at":"2022-07-14T22:17:01.701Z","dependency_job_id":"2bd3cf1e-e434-47c7-a08a-8a414cd1e033","html_url":"https://github.com/tseemann/snippy","commit_stats":null,"previous_names":[],"tags_count":49,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tseemann%2Fsnippy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tseemann%2Fsnippy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tseemann%2Fsnippy/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tseemann%2Fsnippy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/own
ers/tseemann","download_url":"https://codeload.github.com/tseemann/snippy/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248508897,"owners_count":21115890,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bacteria","bioinformatics","fastq-analysis","genomics","haploid","indel-discovery","snps","variant-calling","vcf"],"created_at":"2025-04-12T02:49:15.316Z","updated_at":"2025-04-12T02:49:15.858Z","avatar_url":"https://github.com/tseemann.png","language":"Perl","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![CI](https://github.com/tseemann/snippy/actions/workflows/main.yml/badge.svg)](https://github.com/tseemann/snippy/actions/workflows/main.yml)\n[![License: GPL v2](https://img.shields.io/badge/License-GPL%20v2-blue.svg)](https://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html)\n![Don't judge me](https://img.shields.io/badge/Language-Perl_5-steelblue.svg)\n\n# Snippy\nRapid haploid variant calling and core genome alignment\n\n## Author\n[Torsten Seemann](https://twitter.com/torstenseemann)\n\n## Synopsis\n\nSnippy finds SNPs between a haploid reference genome and your NGS sequence\nreads.  It will find both substitutions (snps) and insertions/deletions\n(indels).  It will use as many CPUs as you can give it on a single computer\n(tested to 64 cores).  It is designed with speed in mind, and produces a\nconsistent set of output files in a single folder.  
It can then take a set\nof Snippy results using the same reference and generate a core SNP alignment\n(and ultimately a phylogenomic tree).\n\n## Quick Start\n```\n% snippy --cpus 16 --outdir mysnps --ref Listeria.gbk --R1 FDA_R1.fastq.gz --R2 FDA_R2.fastq.gz\n\u003ccut\u003e\nWalltime used: 3 min, 42 sec\nResults folder: mysnps\nDone.\n\n% ls mysnps\nsnps.vcf snps.bed snps.gff snps.csv snps.tab snps.html \nsnps.bam snps.txt reference/ ...\n\n% head -5 mysnps/snps.tab\nCHROM  POS     TYPE    REF   ALT    EVIDENCE        FTYPE STRAND NT_POS AA_POS LOCUS_TAG GENE PRODUCT EFFECT\nchr      5958  snp     A     G      G:44 A:0        CDS   +      41/600 13/200 ECO_0001  dnaA replication protein DnaA missense_variant c.548A\u003eC p.Lys183Thr\nchr     35524  snp     G     T      T:73 G:1 C:1    tRNA  -   \nchr     45722  ins     ATT   ATTT   ATTT:43 ATT:1   CDS   -                    ECO_0045  gyrA DNA gyrase\nchr    100541  del     CAAA  CAA    CAA:38 CAAA:1   CDS   +                    ECO_0179      hypothetical protein\nplas      619  complex GATC  AATA   GATC:28 AATA:0  \nplas     3221  mnp     GA    CT     CT:39 CT:0      CDS   +                    ECO_p012  rep  hypothetical protein\n\n% snippy-core --prefix core mysnps1 mysnps2 mysnps3 mysnps4 \nLoaded 4 SNP tables.\nFound 2814 core SNPs from 96615 SNPs.\n\n% ls core.*\ncore.aln core.tab core.tab core.txt core.vcf\n```\n\n# Installation\n\n## Conda\nInstall [Bioconda](https://bioconda.github.io/user/install.html) then:\n```\nconda install -c conda-forge -c bioconda -c defaults snippy\n```\n\n## Homebrew\nInstall [Homebrew](http://brew.sh/) (MacOS)\nor [LinuxBrew](http://linuxbrew.sh/) (Linux) then:\n```\nbrew install brewsci/bio/snippy\n```\n\n## Source\nThis will install the latest version direct from Github. 
\nYou'll need to add Snippy's `bin` directory to your `$PATH`.\n```\ncd $HOME\ngit clone https://github.com/tseemann/snippy.git\n$HOME/snippy/bin/snippy --help\n```\n\n# Check installation\nEnsure you have the desired version:\n```\nsnippy --version\n```\nCheck that all dependencies are installed and working:\n```\nsnippy --check\n```\n\n# Calling SNPs\n\n## Input Requirements\n* a reference genome in FASTA or GENBANK format (can be in multiple contigs)\n* sequence read file(s) in FASTQ or FASTA format (can be .gz compressed)\n* a folder to put the results in\n\n## Output Files\n\nExtension | Description\n----------|--------------\n.tab | A simple [tab-separated](http://en.wikipedia.org/wiki/Tab-separated_values) summary of all the variants\n.csv | A [comma-separated](http://en.wikipedia.org/wiki/Comma-separated_values) version of the .tab file\n.html | An [HTML](http://en.wikipedia.org/wiki/HTML) version of the .tab file\n.vcf | The final annotated variants in [VCF](http://en.wikipedia.org/wiki/Variant_Call_Format) format\n.bed | The variants in [BED](http://genome.ucsc.edu/FAQ/FAQformat.html#format1) format\n.gff | The variants in [GFF3](http://www.sequenceontology.org/gff3.shtml) format\n.bam | The alignments in [BAM](http://en.wikipedia.org/wiki/SAMtools) format. Includes unmapped, multimapping reads. 
Excludes duplicates.\n.bam.bai | Index for the .bam file\n.log | A log file with the commands run and their outputs\n.aligned.fa | A version of the reference but with `-` at positions with `depth=0` and `N` for `0 \u003c depth \u003c --mincov` (**does not have variants**)\n.consensus.fa | A version of the reference genome with *all* variants instantiated\n.consensus.subs.fa | A version of the reference genome with *only substitution* variants instantiated\n.raw.vcf | The unfiltered variant calls from Freebayes\n.filt.vcf | The filtered variant calls from Freebayes\n.vcf.gz | Compressed .vcf file via [BGZIP](http://blastedbio.blogspot.com.au/2011/11/bgzf-blocked-bigger-better-gzip.html)\n.vcf.gz.csi | Index for the .vcf.gz via `bcftools index`\n\n:warning: :x: Snippy 4.x does **NOT** produce the following files that Snippy 3.x did\n\nExtension | Description\n----------|--------------\n.vcf.gz.tbi | Index for the .vcf.gz via [TABIX](http://bioinformatics.oxfordjournals.org/content/27/5/718.full)\n.depth.gz | Output of `samtools depth -aa` for the `.bam` file\n.depth.gz.tbi | Index for the `.depth.gz` file\n\n## Columns in the TAB/CSV/HTML formats\n\nName | Description\n-----|------------\nCHROM | The sequence the variant was found in eg. 
the name after the ```\u003e``` in the FASTA reference\nPOS | Position in the sequence, counting from 1\nTYPE | The variant type: snp mnp ins del complex\nREF | The nucleotide(s) in the reference\nALT | The alternate nucleotide(s) supported by the reads\nEVIDENCE | Frequency counts for REF and ALT\n\nIf you supply a Genbank file as the `--reference` rather than a FASTA\nfile, Snippy will fill in these extra columns by using the genome annotation\nto tell you which feature was affected by the variant:\n\nName | Description\n-----|------------\nFTYPE | Class of feature affected: CDS tRNA rRNA ...\nSTRAND | Strand the feature was on: + - .\nNT_POS | Nucleotide position of the variant within the feature / Length in nt\nAA_POS | Residue position / Length in aa (only if FTYPE is CDS)\nLOCUS_TAG | The `/locus_tag` of the feature (if it existed)\nGENE | The `/gene` tag of the feature (if it existed)\nPRODUCT | The `/product` tag of the feature (if it existed)\nEFFECT | The `snpEff` annotated consequence of this variant (ANN tag in .vcf)\n\n## Columns in TXT format\n\nName | Description\n-----|------------\nID | Reference + Sample\nLENGTH | Length of the reference\nALIGNED | Number of sites aligned to\nUNALIGNED | Number of sites unaligned\nVARIANT | Number of sites different from the reference\nHET | Number of sites heterozygous or poor quality genotype represented with an n (`--minqual`)\nMASKED | Number of sites masked in reference represented with an X (`--mask`)\nLOWCOV | Number of sites low coverage in this sample represented with an N (`--mincov`)\n\n## Variant Types\n\nType | Name | Example\n-----|------|-------------\nsnp  | Single Nucleotide Polymorphism |  A =\u003e T\nmnp  | Multiple Nucleotide Polymorphism | GC =\u003e AT\nins  | Insertion | ATT =\u003e AGTT\ndel  | Deletion | ACGG =\u003e ACG\ncomplex | Combination of snp/mnp | ATTC =\u003e GTTA\n\n## The variant caller\n\nThe variant calling is done by\n[Freebayes](https://github.com/ekg/freebayes).\nThe key 
parameters under user control are:\n\n* `--mincov` - the minimum number of reads covering a site to be considered (default=10)\n* `--minfrac` - the minimum proportion of those reads which must differ from the reference\n* `--minqual` - the minimum VCF variant call \"quality\" (default=100)\n\n## Looking at variants in detail with `snippy-vcf_report`\n\nIf you run Snippy with the `--report` option it will automatically run\n`snippy-vcf_report` and generate a `snps.report.txt` which has a section\nlike this for each SNP in `snps.vcf`:\n```\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\u003eLBB_contig000001:10332 snp A=\u003eT DP=7 Q=66.3052 [7]\n\n         10301     10311     10321     10331     10341     10351     10361\ntcttctccgagaagggaatataatttaaaaaaattcttaaataattcccttccctcccgttataaaaattcttcgcttat\n........................................T.......................................\n,,,,,,  ,,,,,,,,,,,,,,,,,,,,,t,,,,,,,,,,t,,t,,,,,,,,,,,,,,,,g,,,,,,,g,,,,,,,,,t,\n,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, .......T..................A............A.......\n.........................A........A.....T...........    
.........C..............\n.....A.....................C..C........CT.................TA.............\n,a,,,,,a,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,t,t,,,g,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,\n,,,,,ga,,,,,,,c,,,,,,,t,,,,,,,,,,g,,,,,,t,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,\n                            ............T.C..............G...............G......\n                                                    ,,,,,,,g,,,,,,,,g,,,,,,,,,,,\n                                                           g,,,,,,,,,,,,,,,,,,,,\n```\n\nIf you wish to generate this report *after* you have run Snippy, you can\nrun it directly:\n```\ncd snippydir\nsnippy-vcf_report --cpus 8 --auto \u003e snps.report.txt\n```\nIf you want a HTML version for viewing in a web browser, use the `--html` option:\n```\ncd snippydir\nsnippy-vcf_report --html --cpus 16 --auto \u003e snps.report.html\n```\nIt works by running `samtools tview` for each variant, which can be very slow\nif you have 1000s of variants. Using `--cpus` as high as possible is recommended.\n\n## Options\n\n* `--rgid` will set the Read Group (`RG`) ID (`ID`) and Sample (`SM`) in the BAM and VCF file.\nIf not supplied, it will use the `--outdir` folder name for both `ID` and `SM`.\n\n* `--mapqual` is the minimum mapping quality to accept in variant calling. BWA MEM uses `60`\nto mean a read is \"uniquely mapped\". \n\n* `--basequal` is the minimum quality a nucleotide needs to be used in variant calling. We use\n`13` which corresponds to error probability of ~5%. It is a traditional SAMtools value.\n\n* `--maxsoft` is how many bases of an alignment to allow to be soft-clipped before discarding\nthe alignment. This is to encourage global over local alignment, and is passed to the\n`samclip` tool.\n\n* `--mincov` and `--minfrac` are used to apply hard thresholds to the variant calling\nbeyond the existing statistical measure. The optimal values depend on your sequencing\ndepth and contamination rate. 
Values of 10 and 0.9 are commonly used.\n\n* `--targets` takes a BED file and only calls variants in those regions. Not normally needed\nunless you are only interested in variants in specific loci (eg. AMR genes) but are still\nperforming WGS rather than amplicon sequencing.\n\n* `--contigs` allows you to call SNPs from contigs rather than reads. It shreds the contigs\ninto synthetic reads, so as to put the calls on even footing with other read samples in a \nmulti-sample analysis.\n\n\n# Core SNP phylogeny\n\nIf you call SNPs for multiple isolates from the same reference, you can\nproduce an alignment of \"core SNPs\" which can be used to build a\nhigh-resolution phylogeny (ignoring possible recombination).  A \"core site\"\nis a genomic position that is present in _all_ the samples.  A core site can\nhave the same nucleotide in every sample (\"monomorphic\") or some samples can\nbe different (\"polymorphic\" or \"variant\").  If we ignore the complications\nof \"ins\", \"del\" variant types, and just use variant sites, these are the \"core SNP genome\".\n\n## Input Requirements\n* a set of Snippy folders which used the same `--ref` sequence.\n\n### Using `snippy-multi`\n\nTo simplify running a set of isolate sequences (reads or contigs)\nagainst the same reference, you can use the `snippy-multi` script.\nThis script requires a *tab separated* input file as follows, and\ncan handle paired-end reads, single-end reads, and assembled contigs.\n```\n# input.tab = ID R1 [R2]\nIsolate1\t/path/to/R1.fq.gz\t/path/to/R2.fq.gz\nIsolate1b\t/path/to/R1.fastq.gz\t/path/to/R2.fastq.gz\nIsolate1c\t/path/to/R1.fa\t\t/path/to/R2.fa\n# single end reads supported too\nIsolate2\t/path/to/SE.fq.gz\nIsolate2b\t/path/to/iontorrent.fastq\n# or already assembled contigs if you don't have reads\nIsolate3\t/path/to/contigs.fa\nIsolate3b\t/path/to/reference.fna.gz\n```\nThen one would run this to generate the output script.\nThe first parameter should be the `input.tab` file.\nThe remaining 
parameters should be any remaining\nshared `snippy` parameters. The `ID` will be used for\neach isolate's `--outdir`.\n```\n% snippy-multi input.tab --ref Reference.gbk --cpus 16 \u003e runme.sh\n\n% less runme.sh   # check the script makes sense\n\n% sh ./runme.sh   # leave it running over lunch\n```\nIt will also run `snippy-core` at the end to generate the\ncore genome SNP alignment files `core.*`.\n\n## Output Files\n\nExtension | Description\n----------|--------------\n.aln | A core SNP alignment in the `--aformat` format (default FASTA)\n.full.aln | A whole genome SNP alignment (includes invariant sites)\n.tab | Tab-separated columnar list of **core** SNP sites with alleles but NO annotations\n.vcf | Multi-sample VCF file with genotype `GT` tags for all discovered alleles\n.txt | Tab-separated columnar list of alignment/core-size statistics\n.ref.fa | FASTA version/copy of the `--ref`\n.self_mask.bed | BED file generated if `--mask auto` is used.\n\n## Why is `core.full.aln` an alphabet soup?\n\nThe `core.full.aln` file is a FASTA formatted multiple sequence alignment file.\nIt has one sequence for the reference, and one for each sample participating in\nthe core genome calculation.  Each sequence has the same length as the reference\nsequence.\n\nCharacter | Meaning\n----------|-----------\n`ATGC`    | Same as the reference\n`atgc`    | Different from the reference\n`-`       | Zero coverage in this sample **or** a deletion relative to the reference\n`N`       | Low coverage in this sample (based on `--mincov`)\n`X`       | Masked region of reference (from `--mask`)\n`n`       | Heterozygous or poor quality genotype  (has `GT=0/1` or `QUAL \u003c --minqual` in `snps.raw.vcf`)\n\nYou can remove all the \"weird\" characters and replace them with `N` using the included\n`snippy-clean_full_aln`.  
This is useful when you need to pass it to a tree-building\nor recombination-removal tool:\n\n```\n% snippy-clean_full_aln core.full.aln \u003e clean.full.aln\n% run_gubbins.py -p gubbins clean.full.aln\n% snp-sites -c gubbins.filtered_polymorphic_sites.fasta \u003e clean.core.aln\n% FastTree -gtr -nt clean.core.aln \u003e clean.core.tree\n```\n\n## Options\n\n* If you want to mask certain regions of the genome, you can provide a BED file\n  with the `--mask` parameter. Any SNPs in those regions will be excluded. This\n  is common for genomes like *M.tuberculosis* where pesky repetitive PE/PPE/PGRS\n  genes cause false positives, or masking phage regions. A `--mask` bed file\n  for *M.tb* is provided with Snippy in the `etc/Mtb_NC_000962.3_mask.bed`\n  folder. It is derived from the XLSX file from https://gph.niid.go.jp/tgs-tb/\n* If you use the `snippy --cleanup` option the reference files will be deleted.\n  This means `snippy-core` can not \"auto-find\" the reference. In this case you\n  simply use `snippy-core --reference REF` to provide the reference in FASTA format.\n\n# Advanced usage\n\n## Increasing speed when too many reads\n\nSometimes you will have far more sequencing depth than you need to call SNPs.\nA common problem is a whole MiSeq flowcell for a single bacterial isolate,\nwhere 25 million reads results in genome depth as high as 2000x. This makes\nSnippy far slower than it needs to be, as most SNPs will be recovered with\n50-100x depth. 
If you know you have 10 times as much data as you need,\nSnippy can randomly sub-sample your FASTQ data:\n```\n# have 1000x depth, only need 100x so sample at 10%\nsnippy --subsample 0.1  ...\n\u003csnip\u003e\nSub-sampling reads at rate 0.1\n\u003csnip\u003e\n```\n\n## Only calling SNPs in particular regions\n\nIf you are looking for specific SNPs, say AMR related ones in particular genes\nin your reference genome, you can save much time by only calling variants there.\nJust put the regions of interest into a BED file:\n```\nsnippy --targets sites.bed ...\n```\n\n## Finding SNPs between contigs\n\nSometimes one of your samples is only available as contigs, without\ncorresponding FASTQ reads. You can still use these contigs with Snippy\nto find variants against a reference. It does this by shredding the contigs\ninto 250 bp single-end reads at `2 \u0026times; --mincov` uniform coverage.\n\nTo use this feature, instead of providing `--R1` and `--R2` you use the\n`--ctgs` option with the contigs file:\n\n```\n% ls\nref.gbk mutant.fasta\n\n% snippy --outdir mut1 --ref ref.gbk --ctgs mut1.fasta\nShredding mut1.fasta into pseudo-reads.\nIdentified 257 variants.\n\n% snippy --outdir mut2 --ref ref.gbk --ctgs mut2.fasta\nShredding mut2.fasta into pseudo-reads.\nIdentified 413 variants.\n\n% snippy-core mut1 mut2 \nFound 129 core SNPs from 541 variant sites.\n\n% ls\ncore.aln core.full.aln ...\n```\n\nThis output folder is completely compatible with `snippy-core` so you can\nmix FASTQ and contig based `snippy` output folders to produce alignments.\n\n## Correcting assembly errors\n\nThe _de novo_ assembly process attempts to reconstruct the reads into the original \nDNA sequences they were derived from. These reconstructed sequences are called \n_contigs_ or _scaffolds_. 
For various reasons, small errors can be introduced into\nthe assembled contigs which are not supported by the original reads used in the \nassembly process.\n\nA common strategy is to align the reads back to the contigs to check for discrepancies.\nThese errors appear as variants (SNPs and indels). If we can _reverse_ these variants\nthen we can \"correct\" the contigs to match the evidence provided by the original reads.\nObviously this strategy can go wrong if one is not careful about _how_ the read alignment\nis performed and which variants are accepted.\n\nSnippy is able to help with this contig correction process. In fact, it produces a\n`snps.consensus.fa` FASTA file which is the `ref.fa` input file provided but with the\ndiscovered variants in `snps.vcf` applied! \n\nHowever, Snippy is not perfect and sometimes finds questionable variants. Typically\nyou would make a copy of `snps.vcf` (let's call it `corrections.vcf`) and remove those\nlines corresponding to variants we don't trust. For example, when correcting Roche 454\nand PacBio SMRT contigs, we primarily expect to find homopolymer errors and hence\nexpect to see `ins` more than `snp` type variants. \n\nIn this case you need to run the correcting process manually using these steps:\n\n```\n% cd snippy-outdir\n% cp snps.vcf corrections.vcf\n% $EDITOR corrections.vcf\n% bgzip -c corrections.vcf \u003e corrections.vcf.gz\n% tabix -p vcf corrections.vcf.gz\n% vcf-consensus corrections.vcf.gz \u003c ref.fa \u003e corrected.fa\n```\n\nYou may wish to _iterate_ this process by using `corrected.fa` as a new `--ref` for\na repeated run of Snippy. 
Sometimes correcting one error allows BWA to align things\nit couldn't before, and new errors are uncovered.\n\nSnippy may not be the best way to correct assemblies - you should consider\ndedicated tools such as [PILON](http://www.broadinstitute.org/software/pilon/) \nor [iCorn2](http://icorn.sourceforge.net/), or adjust the \nQuiver parameters (for Pacbio data).\n\n## Unmapped Reads\n\nSometimes you are interested in the reads which did *not* align to the reference genome.\nThese reads represent DNA that was novel to *your* sample which is potentially interesting.\nA standard strategy is to *de novo* assemble the unmapped reads to discover these novel\nDNA elements, which often comprise mobile genetic elements such as plasmids.\n\nBy default, Snippy does **not** keep the unmapped reads, not even in the BAM file.\nIf you wish to keep them, use the `--unmapped` option and the unaligned reads will\nbe saved to a compressed FASTQ file:\n\n```\n% snippy --outdir out --unmapped ....\n\n% ls out/\nsnps.unmapped.fastq.gz ....\n```\n\n# Information\n\n## Etymology\n\nThe name Snippy is a combination of\n[SNP](http://en.wikipedia.org/wiki/Single-nucleotide_polymorphism)\n(pronounced \"snip\") , [snappy](http://www.thefreedictionary.com/snappy)\n(meaning \"quick\") and [Skippy the Bush\nKangaroo](http://en.wikipedia.org/wiki/Skippy_the_Bush_Kangaroo) (to\nrepresent its Australian origin)\n\n## License\n\nSnippy is free software, released under the \n[GPL (version 2)](https://raw.githubusercontent.com/tseemann/snippy/master/LICENSE).\n\n## Issues\n\nPlease submit suggestions and bug reports to the \n[Issue Tracker](https://github.com/tseemann/snippy/issues)\n\n## Requirements\n\n* perl \u003e= 5.18\n* bioperl \u003e= 1.7\n* bwa mem \u003e= 0.7.12 \n* minimap2 \u003e= 2.0\n* samtools \u003e= 1.7\n* bcftools \u003e= 1.7\n* bedtools \u003e= 2.0\n* GNU parallel \u003e= 2013xxxx\n* freebayes \u003e= 1.1 (freebayes, freebayes-parallel, fasta_generate_regions.py)\n* vcflib \u003e= 
1.0 (vcfstreamsort, vcfuniq, vcffirstheader)\n* [vt](https://genome.sph.umich.edu/wiki/Vt) \u003e= 0.5\n* snpEff \u003e= 4.3\n* samclip \u003e= 0.2\n* seqtk \u003e= 1.2\n* snp-sites \u003e= 2.0\n* any2fasta \u003e= 0.4\n* wgsim \u003e= 1.8 (for testing only - `wgsim` command)\n\n## Bundled binaries\n\nFor Linux (compiled on Ubuntu 16.04 LTS) and macOS (compiled on High Sierra Brew) \nsome of the binaries, JARs and scripts are included.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftseemann%2Fsnippy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftseemann%2Fsnippy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftseemann%2Fsnippy/lists"}