https://github.com/pontushojer/awesome-linked-reads

Collection of tools and resources for linked-reads

# Awesome Linked Reads

This is a collection of tools and resources for analysis and processing of linked-reads.

- [Tools](#tools)
- [Linked Read Platforms](#linked-read-platforms)

## Tools
Name|Category|Description|Last commit
---|---|:---|-----:
[Aldy](https://github.com/0xTCG/aldy)|structural variants, variant calling|Allelic decomposition and exact genotyping of highly polymorphic and structurally variant genes|![GitHub last commit](https://img.shields.io/github/last-commit/0xTCG/aldy?label=%20)
[Ambigram](https://github.com/deepomicslab/Ambigram)|structural variants|Detection of complex breakage-fusion-bridge genome rearrangements that supports linked-reads|![GitHub last commit](https://img.shields.io/github/last-commit/deepomicslab/Ambigram?label=%20)
[Aquila](https://github.com/maiziex/Aquila)|assembly, pipeline|Diploid personal genome assembly and comprehensive variant detection based on linked-reads|![GitHub last commit](https://img.shields.io/github/last-commit/maiziex/Aquila?label=%20)
[Aquila_stLFR](https://github.com/maiziex/Aquila_stLFR)|assembly, pipeline|Human haplotype-resolved assembly and variant detection for stLFR, hybrid assembly for linked-reads|![GitHub last commit](https://img.shields.io/github/last-commit/maiziex/Aquila_stLFR?label=%20)
[AquilaDeepFilter](https://github.com/maiziezhoulab/AquilaDeepFilter)|structural variants|Deep learning based filtering of genome-wide false positive large deletions|![GitHub last commit](https://img.shields.io/github/last-commit/maiziezhoulab/AquilaDeepFilter?label=%20)
[AquilaSV](https://github.com/maiziezhoulab/AquilaSV)|structural variants, variant calling|Structural variant detection from region-based phased diploid assemblies for 10X and stLFR linked-reads|![GitHub last commit](https://img.shields.io/github/last-commit/maiziezhoulab/AquilaSV?label=%20)
[ARBitR](https://github.com/markhilt/ARBitR)|scaffolding|ARBitR is an overlap aware genome assembly scaffolder for linked sequencing reads.|![GitHub last commit](https://img.shields.io/github/last-commit/markhilt/ARBitR?label=%20)
[Ariadne](https://github.com/lauren-mak/ariadne)|assembly, metagenomics|de Bruijn graph-based program for barcoded read deconvolution|![GitHub last commit](https://img.shields.io/github/last-commit/lauren-mak/ariadne?label=%20)
[arcs](https://github.com/bcgsc/arcs)|assembly|Scaffold genome sequence assemblies. |![GitHub last commit](https://img.shields.io/github/last-commit/bcgsc/arcs?label=%20)
[Athena](https://github.com/abishara/athena_meta)|assembly, metagenomics|Read cloud assembler for metagenomes|![GitHub last commit](https://img.shields.io/github/last-commit/abishara/athena_meta?label=%20)
[BarCrawler](https://github.com/J35P312/BarCrawler)|qc|QC package for 10X genomics barcoded reads.|![GitHub last commit](https://img.shields.io/github/last-commit/J35P312/BarCrawler?label=%20)
[bcctools](https://github.com/kehrlab/bcctools)|toolkit|Correcting barcodes in 10X linked-read sequencing data|![GitHub last commit](https://img.shields.io/github/last-commit/kehrlab/bcctools?label=%20)
[bcmap](https://github.com/kehrlab/bcmap)|mapping, toolkit|Fast tool to map approximate genome locations for barcoded molecules|![GitHub last commit](https://img.shields.io/github/last-commit/kehrlab/bcmap?label=%20)
[BLR](https://github.com/AfshinLab/BLR)|pipeline|An end-to-end Snakemake workflow for whole genome haplotyping and structural variant calling from FASTQs from multiple linked-read technologies.|![GitHub last commit](https://img.shields.io/github/last-commit/AfshinLab/BLR?label=%20)
[bxtools](https://github.com/walaj/bxtools)|toolkit| Tools for analyzing mapped 10x data|![GitHub last commit](https://img.shields.io/github/last-commit/walaj/bxtools?label=%20)
[cloudSPAdes](https://github.com/ablab/spades/tree/cloudspades-ismb)|assembly|Assembly of synthetic long reads using de Bruijn graphs|![GitHub last commit (branch)](https://img.shields.io/github/last-commit/ablab/spades/cloudspades-ismb?label=%20)
[ChromeQC](https://github.com/bcgsc/chromeqc)|qc|Summarize sequencing library quality of 10x Genomics Chromium linked reads|![GitHub last commit](https://img.shields.io/github/last-commit/bcgsc/chromeqc?label=%20)
[Cue](https://github.com/PopicLab/cue)|structural variants|Deep learning framework for SV calling and genotyping|![GitHub last commit](https://img.shields.io/github/last-commit/PopicLab/cue?label=%20)
[DrLink](https://github.com/schneebergerlab/DrLink)|structural variants|Detecting recombination breakpoints using Linked read sequencing|![GitHub last commit](https://img.shields.io/github/last-commit/schneebergerlab/DrLink?label=%20)
[EMerAld (EMA)](https://github.com/arshajii/ema)|mapping|Performs barcode-aware alignment of linked reads. Also does preprocessing of 10x Genomics data.|![GitHub last commit](https://img.shields.io/github/last-commit/arshajii/ema?label=%20)
[Gemtools](https://github.com/sgreer77/gemtools)|toolkit|Tools for working with linked-read sequencing (10X Genomics) data|![GitHub last commit](https://img.shields.io/github/last-commit/sgreer77/gemtools?label=%20)
[grocsvs](https://github.com/grocsvs/grocsvs)|structural variants|Genome-wide reconstruction of complex structural variants|![GitHub last commit](https://img.shields.io/github/last-commit/grocsvs/grocsvs?label=%20)
[HapCUT2](https://github.com/vibansal/HapCUT2)|phasing|Phasing of barcode linked reads|![GitHub last commit](https://img.shields.io/github/last-commit/vibansal/HapCUT2?label=%20)
[HapTree-X](https://github.com/0xTCG/haptreex)|phasing|Haplotype phaser for next-generation sequencing data|![GitHub last commit](https://img.shields.io/github/last-commit/0xTCG/haptreex?label=%20)
[HARPY](https://github.com/pdimens/harpy)|pipeline|Process raw haplotagging data, from raw sequences to phased haplotypes, batteries included.|![GitHub last commit](https://img.shields.io/github/last-commit/pdimens/harpy?label=%20)
[HAST](https://github.com/BGI-Qingdao/HAST)|assembly|Haplotype-Resolved Assembly for Synthetic Long Reads Using A Trio-Binning Strategy|![GitHub last commit](https://img.shields.io/github/last-commit/BGI-Qingdao/HAST?label=%20)
[Lancet](https://github.com/nygenome/lancet)|variant calling|Microassembly based somatic variant caller for linked-read data|![GitHub last commit](https://img.shields.io/github/last-commit/nygenome/lancet?label=%20)
[Lariat](https://github.com/10XGenomics/lariat)|mapping|Linked-Read Alignment Tool|![GitHub last commit](https://img.shields.io/github/last-commit/10XGenomics/lariat?label=%20)
[LEVIATHAN](https://github.com/morispi/LEVIATHAN)|structural variants|Linked-reads based structural variant caller with barcode indexing|![GitHub last commit](https://img.shields.io/github/last-commit/morispi/LEVIATHAN?label=%20)
[Link_STR](https://github.com/bcgsc/link_str)|toolkit|Analysis scripts developed for genotyping STRs in linked-read data|![GitHub last commit](https://img.shields.io/github/last-commit/bcgsc/link_str?label=%20)
[LinkedSV](https://github.com/WGLab/LinkedSV)|structural variants|Structural variant caller for linked-read sequencing data|![GitHub last commit](https://img.shields.io/github/last-commit/WGLab/LinkedSV?label=%20)
[Linker](https://github.com/rwtourdot/linker)|toolkit|Tools for analyzing long and linked read sequencing|![GitHub last commit](https://img.shields.io/github/last-commit/rwtourdot/linker?label=%20)
[LongRanger](https://github.com/10XGenomics/longranger)|pipeline|Pipeline for alignment, variant calling, phasing, and structural variant calling|![GitHub last commit](https://img.shields.io/github/last-commit/10XGenomics/longranger?label=%20)
[LRez](https://github.com/morispi/LRez)|toolkit|Standalone tool and library allowing to work with barcoded linked-reads|![GitHub last commit](https://img.shields.io/github/last-commit/morispi/LRez?label=%20)
[LRTK-SIM](https://github.com/zhanglu295/LRTK-SIM)|simulation|A program to simulate linked reads sequencing from 10X Chromium System|![GitHub last commit](https://img.shields.io/github/last-commit/zhanglu295/LRTK-SIM?label=%20)
[LRSIM](https://github.com/aquaskyline/LRSIM)|simulation|A simulator for linked reads|![GitHub last commit](https://img.shields.io/github/last-commit/aquaskyline/LRSIM?label=%20)
[MetaTrass](https://github.com/BGI-Qingdao/MetaTrass)|assembly|Taxonomic Reads Assembly For a Single Species to Metagenomics|![GitHub last commit](https://img.shields.io/github/last-commit/BGI-Qingdao/MetaTrass?label=%20)
[Minerva](https://github.com/dcdanko/minerva_barcode_deconvolution)|assembly|Sort Linked Read DNA Into Fragment Specific Clusters|![GitHub last commit](https://img.shields.io/github/last-commit/dcdanko/minerva_barcode_deconvolution?label=%20)
[mLinker](https://github.com/chengzhongzhangDFCI/GenomeBiology-mLinker) ([alt](https://github.com/rwtourdot/mlinker))|phasing, toolkit| Tools for Determining Haplotype Phase from Long/Linked Read Sequencing |![GitHub last commit](https://img.shields.io/github/last-commit/chengzhongzhangDFCI/GenomeBiology-mLinker?label=%20)
[MTG-Link](https://github.com/anne-gcd/MTG-Link)|assembly|Novel gap-filling tool for draft genome assemblies, dedicated to linked read data|![GitHub last commit](https://img.shields.io/github/last-commit/anne-gcd/MTG-Link?label=%20)
[NAIBR](https://github.com/raphael-group/NAIBR) (original)<br>[NAIBR](https://github.com/pontushojer/NAIBR) (fork)|structural variants|Identifies novel adjacencies created by structural variation events such as deletions, duplications, inversions, and complex rearrangements|![GitHub last commit](https://img.shields.io/github/last-commit/raphael-group/NAIBR?label=%20)<br>![GitHub last commit](https://img.shields.io/github/last-commit/pontushojer/NAIBR?label=%20)
[Novel-X](https://github.com/1dayac/Novel-X)|structural variants|Novel insertion detection with 10X reads|![GitHub last commit](https://img.shields.io/github/last-commit/1dayac/Novel-X?label=%20)
[NPGREAT](https://github.com/eleniadam/npgreat)|assembly|A hybrid assembly method that utilizes Nanopore and Linked-Reads datasets for the assembly of the human subtelomere regions.|![GitHub last commit](https://img.shields.io/github/last-commit/eleniadam/npgreat?label=%20)
[Pangaea](https://github.com/ericcombiolab/Pangaea)|assembly, metagenomics|A metagenome assembler for the linked-reads with high-barcode specificity|![GitHub last commit](https://img.shields.io/github/last-commit/ericcombiolab/Pangaea?label=%20)
[proc10xG](https://github.com/ucdavis-bioinformatics/proc10xG)|toolkit|Collection of scripts for processing 10x genomics reads|![GitHub last commit](https://img.shields.io/github/last-commit/ucdavis-bioinformatics/proc10xG?label=%20)
[Pseudoseq](https://github.com/bioinfologics/Pseudoseq.jl)|simulation|Fake genomes, fake sequencing, real insights.|![GitHub last commit](https://img.shields.io/github/last-commit/bioinfologics/Pseudoseq.jl?label=%20)
[Physlr](https://github.com/bcgsc/physlr)|assembly|Construct a Physical Map from Linked Reads|![GitHub last commit](https://img.shields.io/github/last-commit/bcgsc/physlr?label=%20)
[QuickDeconvolution](https://github.com/RolandFaure/QuickDeconvolution)|assembly|Quick and scalable software to deconvolve read clouds from linked-reads experiments without a reference genome |![GitHub last commit](https://img.shields.io/github/last-commit/RolandFaure/QuickDeconvolution?label=%20)
[Samovar](https://github.com/cdarby/samovar)|variant calling|Somatic (mosaic) SNV caller for 10X Genomics data using random forest classification and feature-based filters|![GitHub last commit](https://img.shields.io/github/last-commit/cdarby/samovar?label=%20)
[samplot](https://github.com/ryanlayer/samplot)|structural variants|Plot structural variant signals from many BAMs and CRAMs|![GitHub last commit](https://img.shields.io/github/last-commit/ryanlayer/samplot?label=%20)
[Scaff10x (v5)](https://github.com/wtsi-hpag/Scaff10X)<br>[Scaff10x (≤v4.1)](https://sourceforge.net/projects/phusion2/files/scaff10x/)|assembly| Pipeline for scaffolding and breaking a genome assembly | ![GitHub last commit](https://img.shields.io/github/last-commit/wtsi-hpag/Scaff10X?label=%20)
[SpecHLA](https://github.com/deepomicslab/SpecHLA)|phasing|Reconstructs entire diploid sequences of HLA genes and infers LOH events|![GitHub last commit](https://img.shields.io/github/last-commit/deepomicslab/SpecHLA?label=%20)
[SpLitteR](https://github.com/ablab/spades/tree/splitter-paper) ([alt](https://cab.spbu.ru/software/splitter/))|assembly|Repeat resolution in assembly graph using synthetic long reads|
[stLFRdenovo](https://github.com/BGI-biotools/stLFRdenovo)|assembly|De novo assembly pipeline for barcoded reads. It is based on Supernova, with a FASTQ parsing and sorting module customized for stLFR data.|![GitHub last commit](https://img.shields.io/github/last-commit/BGI-biotools/stLFRdenovo?label=%20)
[stLFRsv](https://github.com/BGI-biotools/stLFRsv)|structural variants|Structural variation (SV) pipeline for stLFR co-barcode reads|![GitHub last commit](https://img.shields.io/github/last-commit/BGI-biotools/stLFRsv?label=%20)
[SuperNova](https://github.com/10XGenomics/supernova)|assembly|10x Genomics Linked-Read Diploid De Novo Assembler|![GitHub last commit](https://img.shields.io/github/last-commit/10XGenomics/supernova?label=%20)
[SVenX](https://github.com/vborjesson/SVenX)|structural variants|Pipeline for SV detection using 10X genomics data|![GitHub last commit](https://img.shields.io/github/last-commit/vborjesson/SVenX?label=%20)
[tenx_utils](https://github.com/friend1ws/tenx_utils)|toolkit|Utility functions for 10x data|![GitHub last commit](https://img.shields.io/github/last-commit/friend1ws/tenx_utils?label=%20)
[Tigmint](https://github.com/bcgsc/tigmint)|assembly|Correct misassemblies using Linked Reads|![GitHub last commit](https://img.shields.io/github/last-commit/bcgsc/tigmint?label=%20)
[TitanCNA_10x](https://github.com/GavinHaLab/TitanCNA_10X_snakemake)|pipeline,structural variants,cancer|Snakemake workflow for 10X Genomics WGS analysis using TitanCNA|![GitHub last commit](https://img.shields.io/github/last-commit/GavinHaLab/TitanCNA_10X_snakemake?label=%20)
[Topsorter](https://github.com/hanfang/Topsorter)|structural variants, qc|Graphic assessment of structural variants|![GitHub last commit](https://img.shields.io/github/last-commit/hanfang/Topsorter?label=%20)
[VISOR](https://github.com/davidebolo1993/VISOR)|simulation|VarIant SimulatOR for short, long and linked reads|![GitHub last commit](https://img.shields.io/github/last-commit/davidebolo1993/VISOR?label=%20)
[Valor](https://github.com/BilkentCompGen/valor)|structural variants|Variation discovery using long range information in linked-reads|![GitHub last commit](https://img.shields.io/github/last-commit/BilkentCompGen/valor?label=%20)
[WhatsHap](https://github.com/whatshap/whatshap)|phasing,qc,toolkit|Read-based phasing of genomic variants, also called haplotype assembly. Implements several tools which work with linked reads|![GitHub last commit](https://img.shields.io/github/last-commit/whatshap/whatshap?label=%20)
[Wrath](https://github.com/annaorteu/wrath)|structural variants,qc|Visualisation and identification of candidate structural variants (SVs) from linked read data|![GitHub last commit](https://img.shields.io/github/last-commit/annaorteu/wrath?label=%20)
[xTea](https://github.com/parklab/xTea)|structural variants|Comprehensive TE insertion identification|![GitHub last commit](https://img.shields.io/github/last-commit/parklab/xTea?label=%20)
[ZoomX](https://bitbucket.org/charade/zoomx/)|structural variants|Single Molecule Based Rearrangement Analysis with Linked Read Sequencing |

## Linked Read Platforms

### 10x Genomics Chromium / GemCode
10x Genomics linked-read technology comes in two versions: the older [GemCode (v1)](https://doi.org/10.1038/nbt.3432) and the more recent [Chromium Genome (v2)](https://doi.org/10.1101/gr.234443.118). Long DNA fragments are combined in droplets with barcode-containing gel beads to create GEMs (Gel Bead-In EMulsions). The fragments are amplified and barcoded using a combination of free random hexamers and barcode-linked random hexamers from the gel beads. The barcoded fragments are then recovered and fragmented before ligation of the 3' sequencing adaptor. Libraries are sequenced using Illumina sequencing. The commercial version of the technology is [currently discontinued](https://www.10xgenomics.com/products/linked-reads).

### TELL-Seq
TELL-seq is based on the technology from [Chen et al. 2020](https://doi.org/10.1101/gr.260380.119) and is commercially available from the company [Universal Sequencing](https://www.universalsequencing.com/). The method uses clonally barcoded beads with attached tagmentases to cut and barcode individual long DNA fragments in solution. A second tagmentation is also performed in solution to introduce a second adaptor. The library is sequenced using Illumina sequencing with a special setup to sequence the barcode as index 1.
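
Because the barcode is read as index 1, each read has to be paired with its index read before barcode-aware tools can use it. The following is a minimal sketch of that pairing, not the workflow of any tool listed above. It assumes the index reads are delivered as a separate FASTQ stream in the same order as the reads. The function names are hypothetical, and writing the barcode as a `BX:Z:` header comment is an assumed convention borrowed from common linked-read tooling.

```python
from itertools import islice
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str, str]  # (header, sequence, quality)


def parse_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence, quality) from 4-line FASTQ records."""
    it = iter(lines)
    while True:
        chunk = [line.rstrip("\n") for line in islice(it, 4)]
        if len(chunk) < 4:  # end of input (or a truncated final record)
            return
        header, seq, _plus, qual = chunk
        yield header, seq, qual


def tag_reads_with_barcode(
    reads: Iterable[Record], index_reads: Iterable[Record]
) -> Iterator[Record]:
    """Attach the index-1 sequence to each read header as a BX:Z comment.

    Assumes both streams are in the same order (hypothetical layout).
    """
    for (header, seq, qual), (i_header, barcode, _iq) in zip(reads, index_reads):
        name = header.split()[0]
        if name != i_header.split()[0]:
            raise ValueError(f"read/index name mismatch: {name} vs {i_header}")
        yield f"{name} BX:Z:{barcode}", seq, qual


# Usage with in-memory example records (made-up sequences)
r1 = ["@read1 1:N:0", "ACGT", "+", "IIII"]
i1 = ["@read1 1:N:0", "AACCGGTT", "+", "IIIIIIII"]
for record in tag_reads_with_barcode(parse_fastq(r1), parse_fastq(i1)):
    print(record)
```

With real data the two line iterables would come from opened FASTQ files; the name check guards against the streams drifting out of sync.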

### stLFR
stLFR (single-tube long fragment read) is based on the technology described in [Wang et al. 2019](https://doi.org/10.1101/gr.245126.118) and is commercially available from [MGI](https://en.mgi-tech.com/products/reagents_info/18/). The technology uses tagmentation to individually cut-and-hold long DNA fragments in solution. The tagmentase-DNA complex is then hybridized and individually wrapped around barcoded beads through the adaptor introduced by the tagmentation. The barcode is then ligated to each subfragment before recovery and final library preparation. Sequencing is performed on the DNBSEQ platforms.

### DBS
Droplet Barcode Sequencing (DBS) is based on the technology described in [Redin et al. 2019](https://doi.org/10.1038/s41598-019-54446-x). Long DNA fragments are subjected to tagmentation using Tn5-covered beads to cut, tag and wrap the fragment around the beads. The DNA-wrapped beads are then used in emulsion PCR along with barcoded oligos. Within each emulsion droplet the barcode and tagged fragments are amplified independently and then linked using overlap-extension. Barcode-linked fragments are recovered and indexed for Illumina sequencing.

### CPT-seq
CPT-seq is based on the technology described in [Amini et al. 2014](https://doi.org/10.1038/ng.3119), with the follow-up CPTv2-seq described in [Zhang et al. 2017](https://doi.org/10.1038/nbt.3897). These technologies were developed by Illumina but are not commercially available.

### Haplotagging
Haplotagging is based on the technology presented in [Meier et al. 2021](https://doi.org/10.1073/pnas.2015005118). The technology uses barcoded beads covered with Tn5 tagmentase to cut and barcode individual long DNA fragments in solution. The beads are coated in a combination of two barcodes AB and CB that become inserted at the 5' and 3' ends of each cut fragment. Barcodes are combinatorially generated with about 85 million possible combinations in total.
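
The figure of about 85 million can be checked with a quick calculation. This sketch assumes the combined barcode is built from four segments with 96 variants each. Those numbers are not stated above and reflect my understanding of the haplotagging design, so treat them as assumptions. The function name is hypothetical.

```python
def total_barcodes(n_segments: int = 4, variants_per_segment: int = 96) -> int:
    """Number of distinct barcodes when every segment varies independently.

    Defaults (4 segments, 96 variants each) are assumptions, not values
    taken from the text above.
    """
    return variants_per_segment ** n_segments


print(f"{total_barcodes():,}")  # 84,934,656, i.e. about 85 million
```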

## Contributions

Is some linked-read related tool missing from this resource? Either create a new [issue](https://github.com/pontushojer/awesome-linked-reads/issues/new/choose) with information about the tool you want to add or submit a pull request with the addition directly.

## Credits

Inspired by the collection in [Awesome-10x-genomics](https://github.com/johandahlberg/awesome-10x-genomics/).