{"id":13756350,"url":"https://github.com/algbio/GraphChainer","last_synced_at":"2025-05-10T03:31:23.333Z","repository":{"id":39997827,"uuid":"265509194","full_name":"algbio/GraphChainer","owner":"algbio","description":"An accurate aligner of long reads to a variation graph, based on co-linear chaining","archived":false,"fork":false,"pushed_at":"2023-03-13T16:15:17.000Z","size":312,"stargazers_count":25,"open_issues_count":2,"forks_count":3,"subscribers_count":4,"default_branch":"master","last_synced_at":"2024-02-12T15:20:46.343Z","etag":null,"topics":["long-reads","pangenomics","read-aligners","variation-graph"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/algbio.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2020-05-20T09:03:43.000Z","updated_at":"2023-10-23T15:33:15.000Z","dependencies_parsed_at":"2024-01-13T03:00:56.770Z","dependency_job_id":"91659ca9-6e39-426d-99fb-6bbf3b18d521","html_url":"https://github.com/algbio/GraphChainer","commit_stats":null,"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/algbio%2FGraphChainer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/algbio%2FGraphChainer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/algbio%2FGraphChainer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/algbio%2FGraphChainer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/algbio","dow
nload_url":"https://codeload.github.com/algbio/GraphChainer/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253358069,"owners_count":21895967,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["long-reads","pangenomics","read-aligners","variation-graph"],"created_at":"2024-08-03T11:00:42.775Z","updated_at":"2025-05-10T03:31:23.004Z","avatar_url":"https://github.com/algbio.png","language":"C++","funding_links":[],"categories":["A list of software capable of analyzing mainly **eukaryotic** genomes for pangenomics."],"sub_categories":[],"readme":"# GraphChainer\n\nGraphChainer is an accurate aligner of long reads to a variation graph, based on co-linear chaining.\n\n### Compiling\n\nTo compile, run these:\n\n- Install [miniconda](https://conda.io/projects/conda/en/latest/user-guide/install/index.html)\n- `git submodule update --init --recursive`\n- `conda env create -f CondaEnvironment.yml`\n- `conda activate GraphChainer`\n- `make bin/GraphChainer`\n\n### Running\n\nQuickstart: `./bin/GraphChainer -t 4 -f reads.fastq -g graph.gfa -a out.gam`\n\nKey parameters:\n- `-t` Number of threads (optional, default 1).\n- `-f` Input reads. Format .fasta / .fastq / .fasta.gz / .fastq.gz. You can input multiple files with `-f file1 -f file2 ...` or `-f file1 file2 ...`.\n- `-g` Input graph, format .gfa / .vg. **This graph must be acyclic**, see below how to construct an acyclic graph with vg.\n- `-a` Output file name. 
Format .gam or .json.\n\nParameters related to colinear chaining:\n- `--sampling-step \u003cdouble\u003e` Sampling step factor (default 1). Values \u003e1 give faster but less accurate alignments; values between 0 and 1 give slower but more accurate alignments. Larger values make the sampling of fragments sparser.\n- `--colinear-split-len \u003cint\u003e` The length of the fragments into which the long read is split to create anchors (default 35).\n- `--colinear-split-gap \u003cint\u003e` The distance between consecutive fragments (default 35). If `--sampling-step` is set, `--colinear-split-gap` is always `ceil(--sampling-step * --colinear-split-len)`.\n- `--colinear-gap \u003cint\u003e` When converting an optimal chain of anchors into an alignment path, split the path if the distance in the graph between consecutive anchors is greater than this value (default 10000).\n\n### Constructing an (acyclic) variation graph\n\nUse [vg](https://github.com/vgteam/vg) and run:\n\n`vg construct -t 30 -a -r {ref} -v {vcf} -R 22 -p -m 3000000`\n\n### Datasets availability\n\nThe graphs built for the experiments of GraphChainer can be found on Zenodo at [https://doi.org/10.5281/zenodo.7729494](https://doi.org/10.5281/zenodo.7729494), [https://doi.org/10.5281/zenodo.6875064](https://doi.org/10.5281/zenodo.6875064) and [https://doi.org/10.5281/zenodo.6587252](https://doi.org/10.5281/zenodo.6587252).\n\nThe real read sets can be found on Zenodo at [TODO](TODO)\n\nThe evaluation pipeline used in the paper can be found at [https://github.com/algbio/GraphChainer-scripts](https://github.com/algbio/GraphChainer-scripts)\n\n### Citation\n\nIf you use GraphChainer, please cite as:\n\nJun Ma, Manuel Cáceres, Leena Salmela, Veli Mäkinen, Alexandru I. Tomescu. Chaining for Accurate Alignment of Erroneous Long Reads to Acyclic Variation Graphs. 
Submitted, 2022\n\n### Credits\n\nGraphChainer is built on the excellent code base of [GraphAligner](https://github.com/maickrau/GraphAligner), which is released under [MIT License](https://github.com/maickrau/GraphAligner/blob/master/LICENSE.md). GraphAligner is described in the paper [GraphAligner: Rapid and Versatile Sequence-to-Graph Alignment](https://doi.org/10.1186/s13059-020-02157-2) by Mikko Rautiainen and Tobias Marschall.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falgbio%2FGraphChainer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falgbio%2FGraphChainer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falgbio%2FGraphChainer/lists"}