{"id":19154605,"url":"https://github.com/bcgsc/ntlink","last_synced_at":"2026-02-26T20:41:26.487Z","repository":{"id":38048709,"uuid":"294810237","full_name":"bcgsc/ntLink","owner":"bcgsc","description":"Minimizer-based assembly scaffolding and mapping using long reads","archived":false,"fork":false,"pushed_at":"2024-10-11T23:21:24.000Z","size":12883,"stargazers_count":38,"open_issues_count":0,"forks_count":7,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-04-19T19:35:00.014Z","etag":null,"topics":["assembly","bioinformatics","long-reads","minimizers","scaffolding"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bcgsc.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2020-09-11T20:54:46.000Z","updated_at":"2025-04-02T14:39:09.000Z","dependencies_parsed_at":"2024-11-09T08:38:40.201Z","dependency_job_id":null,"html_url":"https://github.com/bcgsc/ntLink","commit_stats":null,"previous_names":[],"tags_count":20,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bcgsc%2FntLink","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bcgsc%2FntLink/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bcgsc%2FntLink/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bcgsc%2FntLink/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bcgsc","download_url":"https://codeload.github.com/bcgsc/ntLink/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252826870,"owners_count":21810196,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["assembly","bioinformatics","long-reads","minimizers","scaffolding"],"created_at":"2024-11-09T08:27:33.628Z","updated_at":"2026-02-26T20:41:26.482Z","avatar_url":"https://github.com/bcgsc.png","language":"Python","readme":"![GitHub release (latest by date)](https://img.shields.io/github/v/release/bcgsc/ntlink)\n![Conda](https://img.shields.io/conda/dn/bioconda/ntlink?label=Conda)\n[![Published in BMC Bioinformatics](https://img.shields.io/badge/Published%20in-BMC%20Bioinformatics-blue)](https://doi.org/10.1186/s12859-021-04451-7)\n[![Published in Current Protocols](https://img.shields.io/badge/Published%20in-Current%20Protocols-blue)](https://doi.org/10.1002/cpz1.733)\n\n![Logo](https://github.com/bcgsc/ntLink/blob/master/ntlink-logo.png)\n\n## Minimizer-based genome assembly scaffolding and mapping using long reads and minimizers\n\n## Description of the algorithm\nntLink uses minimizers to perform a lightweight mapping between the input target assembly and the supplied long reads. These long-read mappings are then used as evidence to orient and order the output scaffolds.\n\n### General steps in the algorithm:\n1. Compute ordered minimizer sketches of the input target assembly and long reads\n2. Use minimizers to map the long reads to the target assembly contigs\n3. Find contig pairs, where joins are suggested by the long-read mapping evidence\n4. Output a scaffold graph, where the nodes are oriented contigs and the edges are joins suggested by the long read data\n5. Traverse the scaffold graph using `abyss-scaffold` to output the final scaffolds\n\n## Credits\nConcept: Rene Warren and Lauren Coombe\n\nDesign and implementation: Lauren Coombe\n\n## Citing ntLink\nIf you use ntLink in your research, please cite:\n\nCoombe L, Li JX, Lo T, Wong J, Nikolic V, Warren RL and Birol I. LongStitch: high-quality genome assembly correction and scaffolding using long reads. BMC Bioinformatics 22, 534 (2021). https://doi.org/10.1186/s12859-021-04451-7\n\nCoombe L, Warren RL, Wong J, Nikolic V and Birol I. ntLink: A toolkit for de novo genome assembly scaffolding and mapping using long reads. Current Protocols 3, e733 (2023). https://doi.org/10.1002/cpz1.733\n\n\n## Usage\n```\nntLink: Scaffolding assemblies using long reads\nUsage: ntLink scaffold target=\u003ctarget scaffolds\u003e reads='List of long read files'\n\nTo additionally run gap-filling (fill gap regions with raw read sequence):\nUsage: ntLink scaffold gap_fill target=\u003ctarget scaffolds\u003e reads='List of long read files'\n\nOptions:\ntarget\t\t\tTarget assembly to be scaffolded in fasta format\nreads\t\t\tList of long read files (separated by a space)\nprefix\t\t\tPrefix of intermediate output files [\u003ctarget\u003e.k\u003ck\u003e.w\u003cw\u003e.z\u003cz\u003e]\nt\t\t\tNumber of threads [4]\nk\t\t\tK-mer size for minimizers [32]\nw\t\t\tWindow size for minimizers [100]\nn\t\t\tMinimum graph edge weight [1]\ng\t\t\tMinimum gap size (bp) [20]\nG\t\t\tMaximum gap size (bp). -1 indicates no maximum [-1]\nf\t\t\tMaximum number of contigs in a run for full transitive edge addition [10]\na\t\t\tMinimum number of anchored ONT reads required for an edge [1]\nz\t\t\tMinimum size of contig (bp) to scaffold [1000]\nv\t\t\tIf 1, track time and memory for each step of the pipeline [0]\npaf\t\t\tIf True, outputs read to contig mappings in PAF-like format [False]\noverlap\t\t\tIf True, runs extra step to attempt to identify and trim overlapping joined sequences [True]\nsensitive\t    \tIf True, runs mapping in sensitive mode [False]\nsoft_mask\t\tIf True, gaps are filled with lowercase bases [False]\n\nNote: \n\t- Ensure all assembly and read files are in the current working directory, making soft links if necessary\n```\n\nRunning `ntLink help` prints the help documentation.\n\n* Input reads files can be gzipped (or not), and in either fastq or fasta format\n\n### Example\nInput files:\n* target assembly `my_assembly.fa` \n* long read file `long_reads.fq.gz`\n\nntLink command:\n```\nntLink scaffold target=my_assembly.fa reads=long_reads.fq.gz k=32 w=250\n```\n\nThe post-ntLink scaffolds file will have the suffix `*ntLink.scaffolds.fa`\n\nSee our [wiki](https://github.com/bcgsc/ntLink/wiki) for more information about output file formats.\n\n\n### Gap-filling\nAs of ntLink v1.2.0, ntLink can also run gap-filling after the scaffolding stage. This mode is enabled by adding the `gap_fill` target to the `ntLink` command. `overlap=True` is required when using the `gap_fill` feature.\n\nNote that the gaps will be filled with raw read sequence, so subsequent polishing is a good idea. See the wiki page for more details.\n\n### Rounds\n\nTo maximize the scaffolding gains, ntLink can be run iteratively in rounds. As of ntLink v1.3.0, these rounds can be launched using the `ntLink_rounds` Makefile, which uses mapping liftover to reduce the computational cost of additional ntLink rounds.\n\nExample command without gap-filling (target `run_rounds_gaps` runs gap-filling, while `run_rounds` does not), running 5 rounds of ntLink:\n```\nntLink_rounds run_rounds target=my_assembly.fa reads=long_reads.fq.gz k=24 w=250 rounds=5\n```\nSee the wiki page for more details.\n\n### Mapping only\n\nTo only run the pairing stage of `ntLink` (the stage where the long reads are mapped to the contigs), use the `pair` target for the `ntLink` command. The mappings can also be output in PAF-like format by specifying `paf=True`.\n\n\n**For more information about the ntLink algorithm and tips for running ntLink see our [wiki](https://github.com/bcgsc/ntLink/wiki)**\n\n ## Installation\n ntLink is available from conda and homebrew package managers.\n \n Installing using conda:\n ```\n conda install -c bioconda -c conda-forge ntlink\n ```\n \n Installing using brew:\n ```\n brew install brewsci/bio/ntlink\n ```\n \n Installing from source code:\n ```\ncurl -L --output ntLink-1.3.11.tar.gz https://github.com/bcgsc/ntLink/releases/download/v1.3.11/ntLink-1.3.11.tar.gz \u0026\u0026 tar xvzf ntLink-1.3.11.tar.gz \n```\n\n#### Testing your installation\nTo test your ntLink installation:\n```\ncd tests\n./test_installation.sh\n```\nThe expected output files can be found in: `tests/expected_outputs`\n\n## Dependencies\n* Python 3.7+ ([Numpy](https://numpy.org/), [Python-igraph](https://igraph.org/python/))\n* [btllib](https://github.com/bcgsc/btllib) 1.6.2 or lower\n* [ABySS v2.3.0+](https://github.com/bcgsc/abyss)\n* GCC 6+ or Clang 5+ with OpenMP\n* [zlib](https://zlib.net/)\n\nPython dependencies can be installed with:\n```\nconda install -c bioconda --file requirements.txt\n```\n\n## License\nntLink Copyright (c) 2020-present British Columbia Cancer Agency Branch. All rights reserved.\n\nntLink is released under the GNU General Public License v3\n\nThis program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 3.\n\nThis program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.\n\nYou should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.\n\nFor commercial licensing options, please contact Patrick Rebstein (prebstein@bccancer.bc.ca).\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbcgsc%2Fntlink","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbcgsc%2Fntlink","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbcgsc%2Fntlink/lists"}