{"id":13756346,"url":"https://github.com/DaehwanKimLab/hisat2","last_synced_at":"2025-05-10T03:31:37.446Z","repository":{"id":41112785,"uuid":"37680726","full_name":"DaehwanKimLab/hisat2","owner":"DaehwanKimLab","description":"Graph-based alignment (Hierarchical Graph FM index)","archived":false,"fork":false,"pushed_at":"2023-11-30T17:38:18.000Z","size":33275,"stargazers_count":446,"open_issues_count":214,"forks_count":112,"subscribers_count":39,"default_branch":"master","last_synced_at":"2024-02-12T15:20:45.723Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DaehwanKimLab.png","metadata":{"files":{"readme":"README.md","changelog":"NEWS","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS","dei":null,"publiccode":null,"codemeta":null}},"created_at":"2015-06-18T19:40:34.000Z","updated_at":"2024-08-03T11:02:05.494Z","dependencies_parsed_at":"2022-07-12T18:17:34.415Z","dependency_job_id":"fc866fcd-a468-4c91-84bd-cb7d60fe3ecb","html_url":"https://github.com/DaehwanKimLab/hisat2","commit_stats":{"total_commits":1319,"total_committers":27,"mean_commits":"48.851851851851855","dds":"0.26762699014404856","last_synced_commit":"50869389923e5fa2ba8cf1e36f711c85a7889ccd"},"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DaehwanKimLab%2Fhisat2","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DaehwanKimLab%2Fhisat2/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DaehwanKimLab%2Fhisat2/releases","manifests_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/repositories/DaehwanKimLab%2Fhisat2/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DaehwanKimLab","download_url":"https://codeload.github.com/DaehwanKimLab/hisat2/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224911465,"owners_count":17390840,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-03T11:00:42.652Z","updated_at":"2024-11-16T11:31:32.166Z","avatar_url":"https://github.com/DaehwanKimLab.png","language":"C++","readme":"# HISAT2\n\n## Contact\n\n[Daehwan Kim](https://kim-lab.org) (infphilo@gmail.com) and [Chanhee Park](https://www.linkedin.com/in/chanhee-park-97677297/) (parkchanhee@gmail.com)\n\n## Overview\nHISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (whole-genome,\ntranscriptome, and exome sequencing data) to a population of human genomes (as well as to a single reference genome).\nBased on an extension of BWT for a graph [1], we designed and implemented a graph FM index (GFM), an original approach\nand its first implementation to the best of our knowledge. In addition to using one global GFM index that represents\ngeneral population, HISAT2 uses a large set of small GFM indexes that collectively cover the whole genome (each index\nrepresenting a genomic region of 56 Kbp, with 55,000 indexes needed to cover human population). 
These small indexes\n(called local indexes) combined with several alignment strategies enable effective alignment of sequencing reads.\nThis new indexing scheme is called the Hierarchical Graph FM index (HGFM). We have developed HISAT2 based on the HISAT\n[2] and Bowtie 2 [3] implementations. See the [HISAT2 website](https://daehwankimlab.github.io/hisat2/) for\nmore information.\n\n\n![](HISAT2.png)\n\nGetting started\n============\nHISAT2 requires a 64-bit computer running either Linux or Mac OS X and at least 8 GB of RAM.\n\nA few notes:\n1) HISAT2's index (HGFM) size for the human reference genome and 12.3 million common SNPs is 6.2 GB. The SNPs consist of 11 million single nucleotide polymorphisms, 728,000 deletions, and 555,000 insertions. Insertions and deletions used in this index are small (usually \u003c20bp). We plan to incorporate structural variations (SV) into this index.\n2) The memory footprint of HISAT2 is relatively low, 6.7 GB.\n3) The runtime of HISAT2 is estimated to be slower than that of HISAT (30–100% slower for some data sets).\n4) HISAT2 provides greater accuracy for alignment of reads containing SNPs.\n5) Use [HISAT-3N](https://daehwankimlab.github.io/hisat2/hisat-3n/) to align nucleotide-converted sequencing reads,\n   including [BS-seq], [SLAM-seq], [scBS-seq], [scSLAM-seq], [TAB-seq], [oxBS-seq], [TAPS] and [EM-seq].\n   This alignment process requires about 10 GB of RAM.\n6) The HISAT2 repository is separate from the HISAT-genotype repository.\n   See the [HISAT-genotype repository](https://github.com/DaehwanKimLab/hisat-genotype)\n   and the [HISAT-genotype homepage](https://daehwankimlab.github.io/hisat-genotype/).\n\n## Install\n    git clone https://github.com/DaehwanKimLab/hisat2.git\n    cd hisat2\n    make\n\nUsage\n============\n## Building an index\n`hisat2-build` builds a HISAT2 index from a set of DNA sequences. 
`hisat2-build` outputs a set of 8 files with\nsuffixes `.1.ht2`, `.2.ht2`, `.3.ht2`, `.4.ht2`, `.5.ht2`, `.6.ht2`, `.7.ht2`, and `.8.ht2`.\nFor a large index, these suffixes end in `ht2l` instead of `ht2`.\nThese files together constitute the index: they are all that is needed to align reads to that reference.\nThe original sequence FASTA files are no longer used by HISAT2 once the index is built.\n\nExample of HISAT2 index building:\n\n    hisat2-build genome.fa genome\n\n## Alignment with HISAT2\n\nExample alignments with HISAT2:\n\n    # single-end FASTA reads, DNA alignment (spliced alignment disabled)\n    hisat2 -f -x genome -U reads.fa -S output.sam --no-spliced-alignment\n\n    # paired-end FASTQ reads\n    hisat2 -x genome -1 reads_1.fq -2 reads_2.fq -S output.sam\n\n\n## For more information, see the following websites:\n* [HISAT2 website](https://daehwankimlab.github.io/hisat2)\n* [HISAT-3N website](https://daehwankimlab.github.io/hisat2/hisat-3n/)\n\n## License\n\n[GPL-3.0](LICENSE)\n\n## References\n\n[1] Sirén J, Välimäki N, Mäkinen V (2014) Indexing graphs for path queries with applications in genome research. IEEE/ACM Transactions on Computational Biology and Bioinformatics 11: 375–388. doi: 10.1109/tcbb.2013.2297101\n\n[2] Kim D, Langmead B, Salzberg SL: HISAT: a fast spliced aligner with low memory requirements. Nat Methods 2015\n\n[3] Langmead B, Salzberg SL: Fast gapped-read alignment with Bowtie 2. 
Nat Methods 2012, 9:357-359\n\n\n[HISAT2]:https://github.com/DaehwanKimLab/hisat2\n[BS-seq]: https://en.wikipedia.org/wiki/Bisulfite_sequencing\n[SLAM-seq]: https://www.nature.com/articles/nmeth.4435\n[scBS-seq]: https://www.nature.com/articles/nmeth.3035\n[scSLAM-seq]: https://www.nature.com/articles/s41586-019-1369-y\n[TAPS]: https://www.nature.com/articles/s41587-019-0041-2\n[oxBS-seq]: https://science.sciencemag.org/content/336/6083/934\n[TAB-seq]: https://www.cell.com/fulltext/S0092-8674%2812%2900534-X\n[EM-seq]: https://genome.cshlp.org/cgi/content/long/gr.266551.120\n\nPublication\n============\n* ### HISAT2 and HISAT-genotype\n  Kim, D., Paggi, J.M., Park, C. et al. [Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype](https://www.nature.com/articles/s41587-019-0201-4). Nat Biotechnol 37, 907–915 (2019)\n\n* ### HISAT-3N\n  Zhang, Y., Park, C., Bennett, C., Thornton, M. and Kim, D [Rapid and accurate alignment of nucleotide conversion sequencing reads with HISAT-3N](https://doi.org/10.1101/gr.275193.120) Genome Research 31(7): 1290-1295 (2021)\n  \n\n","funding_links":[],"categories":["A list of software capable of analyzing mainly **eukaryotic** genomes for pangenomics."],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FDaehwanKimLab%2Fhisat2","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FDaehwanKimLab%2Fhisat2","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FDaehwanKimLab%2Fhisat2/lists"}