https://github.com/fabio-cunial/smaht_experiments

Studying the effect of long-read coverage on structural variants in tissue samples

long-read-sequencing nanopore-sequencing pacbio-sequencing somatic-variants structural-variants

# Effect of coverage on somatic SV calling

[Metadata of all SMaHT samples](https://docs.google.com/spreadsheets/d/11T_QpVq4XEfupEeGD9IW5oVt1our6u3KzEtP6LBEc5w/edit?usp=sharing) at the time of this study.

# Liver

We consider the following input:

* `ST001`: healthy liver sample, sequenced at ~230x with PacBio Revio (this includes a ~26x Fiber-seq BAM) and at ~210x with ONT PromethION 24 (coverages estimated from chr1). Data downloaded from [the benchmarking section of the data portal](https://data.smaht.org/data/benchmarking/donor-st001#liver).
* `SMHT001`: donor whose death was caused by liver failure, with alcohol abuse listed as a death circumstance. Sequenced at ~10x with PacBio Revio. Data downloaded from workspace [SMaHT_Short_Read_Long_Read_Analysis](https://app.terra.bio/#workspaces/smaht-gcc-short-read/SMaHT_Short_Read_Long_Read_Analysis/data), table `SMAHT001_collaborator_long_read`, field `sample_collaborator_id=SMHT001-3I-001A2`.

For each ST001 technology, we merge all the BAMs, take random subsamples at multiples of 10x, and run `sniffles --mosaic` on every subsample, *requiring just two reads* to support a call (i.e. we set `--mosaic-af-min` as a function of coverage). We only output calls that Sniffles considers somatic, i.e. whose allele frequency does not exceed `--mosaic-af-max=0.218` (the default). Each call is annotated with the IDs of the reads that support it. We only consider calls in the standard chromosomes.
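The coverage-dependent parameters described above can be sketched as follows. This is a minimal illustration, not the pipeline used in this study: it assumes that "requiring just two reads" translates to `--mosaic-af-min = 2 / coverage`, and that each subsample is drawn by keeping a fraction `target / total` of the merged reads. The function names (`subsample_fraction`, `mosaic_af_min`, `subsampling_plan`) are hypothetical.

```python
def subsample_fraction(target_coverage: float, total_coverage: float) -> float:
    """Fraction of reads to keep so that the subsample has ~target_coverage."""
    if not 0 < target_coverage <= total_coverage:
        raise ValueError("target coverage must be in (0, total_coverage]")
    return target_coverage / total_coverage


def mosaic_af_min(coverage: float, min_reads: int = 2) -> float:
    """Smallest allele frequency reachable with min_reads supporting reads.

    Assumption: AF is approximated as supporting reads / coverage.
    """
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return min(1.0, min_reads / coverage)


def subsampling_plan(total_coverage: float, step: int = 10,
                     min_reads: int = 2, af_max: float = 0.218) -> list:
    """One row per coverage level at multiples of `step`, up to the total."""
    rows = []
    for coverage in range(step, int(total_coverage) + 1, step):
        rows.append({
            "coverage": coverage,
            "fraction": subsample_fraction(coverage, total_coverage),
            "mosaic_af_min": mosaic_af_min(coverage, min_reads),
            "mosaic_af_max": af_max,  # Sniffles default quoted in the text
        })
    return rows


if __name__ == "__main__":
    # ~230x is the PacBio coverage of ST001 reported above.
    for row in subsampling_plan(230):
        print(row)
```

Under these assumptions the minimum allele frequency shrinks as coverage grows (0.2 at 10x, 0.01 at 200x), so at 10x the accepted range between `--mosaic-af-min` and `--mosaic-af-max` is very narrow.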

![](figures/15.png)

At 10x, SMHT001 (liver failure, circles) does not have more raw calls than ST001:

![](figures/17.png)
![](figures/18.png)

## SMHT001 PacBio (liver failure)
There are only 14 calls in total, and all but one occur in a tandem repeat (TR).

### Potential candidates

![](figures/5.png)
![](figures/6.png)
![](figures/8.png)
![](figures/10.png)
![](figures/14.png)

### Calls unlikely to be somatic

![](figures/1.png)
![](figures/2.png)
![](figures/9.png)

### Calls near complex events

![](figures/3.png)
![](figures/7.png)