https://github.com/fabio-cunial/smaht_experiments
Studying the effect of long-read coverage on structural variants in tissue samples
- Host: GitHub
- URL: https://github.com/fabio-cunial/smaht_experiments
- Owner: fabio-cunial
- Created: 2025-07-31T10:56:19.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2025-07-31T17:58:52.000Z (2 months ago)
- Last Synced: 2025-07-31T18:03:18.894Z (2 months ago)
- Topics: long-read-sequencing, nanopore-sequencing, pacbio-sequencing, somatic-variants, structural-variants
- Language: Dockerfile
- Homepage:
- Size: 5.86 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
# Effect of coverage on somatic SV calling
[Metadata of all SMaHT samples](https://docs.google.com/spreadsheets/d/11T_QpVq4XEfupEeGD9IW5oVt1our6u3KzEtP6LBEc5w/edit?usp=sharing) at the time of this study.
# Liver
We consider the following input:
* `ST001`: healthy liver sample, sequenced at ~230x with PacBio Revio (this includes a ~26x Fiber-seq BAM) and at ~210x with ONT PromethION 24 (coverages estimated from chr1). Data downloaded from [the benchmarking section of the data portal](https://data.smaht.org/data/benchmarking/donor-st001#liver).
* `SMHT001`: donor whose death was caused by liver failure, with alcohol abuse listed as a death circumstance. Sequenced at ~10x with PacBio Revio. Data downloaded from workspace [SMaHT_Short_Read_Long_Read_Analysis](https://app.terra.bio/#workspaces/smaht-gcc-short-read/SMaHT_Short_Read_Long_Read_Analysis/data), table `SMAHT001_collaborator_long_read`, field `sample_collaborator_id=SMHT001-3I-001A2`.

For each ST001 technology, we merge all the BAMs, take random subsamples at multiples of 10x, and run `sniffles --mosaic` on every subsample, *requiring just two reads* to support a call (i.e. we set `--mosaic-af-min` as a function of coverage). We only output calls that Sniffles considers somatic, i.e. whose allele frequency does not exceed `--mosaic-af-max=0.218` (the default). Each call is annotated with the IDs of the reads that support it. We only consider calls in the standard chromosomes.
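The per-coverage parameters described above can be sketched in Python as follows. This is only an illustration, not the code used in this repository: the formula `2 / coverage` for `--mosaic-af-min` is our reading of "as a function of coverage", the helper names and file names are hypothetical, and the commands are built as argument lists without being executed. The subsampling fraction could be passed to a tool such as `samtools view --subsample`.

```python
def coverage_levels(full_cov: float, step: int = 10) -> list[int]:
    """Target coverages at multiples of `step`, up to the full coverage."""
    return list(range(step, int(full_cov) + 1, step))


def subsample_fraction(target_cov: float, full_cov: float) -> float:
    """Fraction of reads to keep so that the expected coverage is `target_cov`."""
    if not 0 < target_cov <= full_cov:
        raise ValueError("target coverage must be in (0, full coverage]")
    return target_cov / full_cov


def mosaic_af_min(coverage: float, min_reads: int = 2) -> float:
    """Assumed rule: the smallest allele frequency reachable with `min_reads` reads."""
    return min_reads / coverage


def sniffles_command(bam: str, vcf: str, coverage: float) -> list[str]:
    """Argument list for one Sniffles run in mosaic mode (not executed here)."""
    return [
        "sniffles",
        "--input", bam,
        "--vcf", vcf,
        "--mosaic",
        "--mosaic-af-min", str(mosaic_af_min(coverage)),
        "--mosaic-af-max", "0.218",   # the default, stated explicitly
        "--output-rnames",            # annotate calls with supporting read IDs
    ]


if __name__ == "__main__":
    full_cov = 230  # approximate ST001 PacBio coverage from the text
    for cov in coverage_levels(full_cov)[:3]:
        frac = subsample_fraction(cov, full_cov)
        cmd = sniffles_command(f"st001_{cov}x.bam", f"st001_{cov}x.vcf", cov)
        print(f"{cov}x: keep fraction {frac:.4f}")
        print(" ".join(cmd))
```

Under this assumed rule, the minimum allele frequency at 10x is 0.2, which is already close to the 0.218 upper bound, so the window of frequencies that can be called somatic is very narrow at low coverage and widens as coverage grows.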

At 10x, SMHT001 (liver failure, circles) does not have more raw calls than ST001:

## SMHT001 PacBio (liver failure)
There are only 14 calls in total, all but one of which occur in a tandem repeat (TR).

### Potential candidates




### Calls unlikely to be somatic


### Calls near complex events

