{"id":13756360,"url":"https://github.com/at-cg/PanAligner","last_synced_at":"2025-05-10T03:31:25.729Z","repository":{"id":165297427,"uuid":"594001085","full_name":"at-cg/PanAligner","owner":"at-cg","description":"Long read aligner for cyclic and acyclic pangenome graphs","archived":false,"fork":false,"pushed_at":"2023-12-20T14:03:47.000Z","size":3975,"stargazers_count":33,"open_issues_count":0,"forks_count":4,"subscribers_count":4,"default_branch":"main","last_synced_at":"2024-02-12T15:20:55.003Z","etag":null,"topics":["pangenome","read-alignment","variation-graphs"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/at-cg.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2023-01-27T11:07:49.000Z","updated_at":"2023-12-12T16:26:21.000Z","dependencies_parsed_at":"2024-01-13T03:00:55.499Z","dependency_job_id":"af81a308-6d0d-48bb-a9ae-d9ad0ca9fc2e","html_url":"https://github.com/at-cg/PanAligner","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/at-cg%2FPanAligner","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/at-cg%2FPanAligner/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/at-cg%2FPanAligner/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/at-cg%2FPanAligner/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/at-cg","download_url":"https://codeload.github.com/at-cg/PanAligner/tar.g
z/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253358069,"owners_count":21895967,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["pangenome","read-alignment","variation-graphs"],"created_at":"2024-08-03T11:00:42.918Z","updated_at":"2025-05-10T03:31:25.260Z","avatar_url":"https://github.com/at-cg.png","language":"C","funding_links":[],"categories":["A list of software capable of analyzing mainly **eukaryotic** genomes for pangenomics."],"sub_categories":[],"readme":"# \u003ca name=\"started\"\u003e\u003c/a\u003eGetting Started\n\n```sh\ngit clone https://github.com/at-cg/PanAligner\ncd PanAligner \u0026\u0026 make\n# Map sequence to graph\n./PanAligner -cx lr test/MT.gfa test/MT-orangA.fa \u003e out.gaf\n```\n\n## Table of Contents\n\n- [Getting Started](#started)\n- [Introduction](#intro)\n- [Users' Guide](#uguide)\n  - [Installation](#install)\n  - [Sequence mapping](#map)\n  - [Hybrid method](#hybrid)\n  - [Benchmark](#bench)\n- [Citation](#cite)\n\n## \u003ca name=\"intro\"\u003e\u003c/a\u003eIntroduction\n\nPanAligner is an efficient tool for aligning long reads or assembly contigs to a cyclic pangenome graph. We follow the seed-chain-extend procedure and provide the first exact implementation of the co-linear chaining technique generalized to cyclic graphs. The details of the formulation and the algorithm are provided in our paper. If the input graph is a DAG, PanAligner works similarly to [minichain](https://github.com/at-cg/minichain.git). 
We reuse open-source code from [minichain](https://github.com/at-cg/minichain.git), [minigraph](https://github.com/lh3/minigraph.git), and [GraphChainer](https://github.com/algbio/GraphChainer.git) for the necessary components other than co-linear chaining. PanAligner can scale to human pangenome graphs and whole-genome sequencing read sets.\n\n## \u003ca name=\"uguide\"\u003e\u003c/a\u003eUsers' Guide\n\n### \u003ca name=\"install\"\u003e\u003c/a\u003eInstallation\nTo install PanAligner, type `make` in the source code directory.\n\n#### Dependencies\n1) [gcc9][gcc9] or a later version\n2) [zlib][zlib]\n\n\n### \u003ca name=\"map\"\u003e\u003c/a\u003eSequence mapping\nPanAligner can be used for both sequence-to-sequence alignment and sequence-to-graph mapping. For sequence-to-sequence alignment, PanAligner maps a read to a reference in FASTA format and provides the read mapping output in [PAF][paf] format. For sequence-to-graph mapping, PanAligner takes the graph in [GFA][gfa1] or [rGFA][rGFA] format as input and provides the read mapping output in [GAF][gaf] format.\n\n```sh\n# Map sequence to sequence\n./PanAligner -cx lr test/MT-human.fa test/MT-orangA.fa \u003e out.paf\n# Map sequence to graph\n./PanAligner -cx lr test/MT.gfa test/MT-orangA.fa \u003e out.gaf\n```\n\n\n### \u003ca name=\"hybrid\"\u003e\u003c/a\u003eHybrid method\nThe Hybrid method leverages the strengths of both [minigraph](https://github.com/lh3/minigraph.git) and PanAligner to achieve efficient and accurate sequence-to-graph mapping. It identifies a subset of reads that are relatively \"easy-to-align\" and aligns them with the fast [minigraph](https://github.com/lh3/minigraph.git) heuristics. The remaining reads are aligned with PanAligner. 
\n\nBefore running `hybrid_method.sh`, ensure that [conda](https://docs.conda.io/en/latest/) is installed and available in your PATH.\n```sh\n# One-time installation of dependencies\nchmod +x get_dependencies.sh\n./get_dependencies.sh\n\n# Create a hybrid_test folder\nmkdir hybrid_test\ncp hybrid_method.sh hybrid_test\ncd hybrid_test\n\n# Map a sequence using the hybrid method in the hybrid_test folder\n./hybrid_method.sh ../test/MT.gfa ../test/MT-human.fa out.gaf 4\n# Arguments to hybrid_method.sh: 1) graph file, 2) query file, 3) output GAF file, 4) number of threads\n```\n\n## \u003ca name=\"bench\"\u003e\u003c/a\u003eBenchmark\n\nWe evaluated PanAligner and the Hybrid method against other sequence-to-graph aligners to assess their scalability and accuracy. The evaluation used human pangenome graphs constructed from [94 high-quality haplotype assemblies](https://github.com/human-pangenomics/HPP_Year1_Assemblies) provided by the Human Pangenome Reference Consortium, along with the [CHM13 human genome assembly](https://www.ncbi.nlm.nih.gov/assembly/GCA_009914755.4) from the Telomere-to-Telomere consortium. The experiments used simulated long reads with 0.5× coverage and a 5% error rate on cyclic graphs of sizes 10H, 40H, and 95H, where the integer prefix is the haplotype count in each graph. The results demonstrated superior read mapping precision, as shown in the [figure](#Plot). 
Notably, even on the largest graph with 95 haplotypes, PanAligner ran efficiently, requiring 2 hours and 36 minutes and 44 GB of RAM with 32 threads on [Perlmutter CPU nodes](https://docs.nersc.gov/systems/perlmutter/architecture/#cpu-nodes).\n\n\n\u003cp align=\"center\" id=\"Plot\"\u003e\n  \u003ca href=\"./data/plot.png\"\u003e\n    \u003cimg src=\"./data/plot.png\" width=\"750\" alt=\"Plot\"\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n\n## \u003ca name=\"cite\"\u003e\u003c/a\u003eCitation\nJyotshna Rajput, Ghanshyam Chandra, Chirag Jain. [Co-linear Chaining on Pangenome Graphs](https://www.biorxiv.org/content/10.1101/2023.06.21.545871v1). WABI 2023\n\n[gwfa]: https://arxiv.org/abs/2206.13574\n[paper_1]: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-02168-z\n[paper_2]: https://www.biorxiv.org/content/10.1101/2022.08.29.505691v2\n[minichain]: https://github.com/at-cg/minichain\n[zlib]: http://zlib.net/\n[gcc9]: https://gcc.gnu.org/\n[rgfa]: https://github.com/lh3/gfatools/blob/master/doc/rGFA.md\n[gfa1]: https://github.com/GFA-spec/GFA-spec/blob/master/GFA1.md\n[gaf]: https://github.com/lh3/gfatools/blob/master/doc/rGFA.md#the-graph-alignment-format-gaf\n[paf]: https://github.com/lh3/miniasm/blob/master/PAF.md\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fat-cg%2FPanAligner","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fat-cg%2FPanAligner","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fat-cg%2FPanAligner/lists"}