https://github.com/hanfang/Topsorter
Graphical assessment of structural variants using 10x Genomics data
- Host: GitHub
- URL: https://github.com/hanfang/Topsorter
- Owner: hanfang
- Created: 2016-10-26T20:49:11.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2017-02-23T22:19:03.000Z (almost 8 years ago)
- Last Synced: 2024-07-31T20:31:35.162Z (6 months ago)
- Topics: genomics, graph, topological-sort
- Language: Python
- Homepage:
- Size: 67.4 KB
- Stars: 10
- Watchers: 6
- Forks: 2
- Open Issues: 1
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-linked-reads - Topsorter (Tools)
README
# Topsorter
### Graphical assessment of structural variants using 10x Genomics data
![alt text](https://github.com/hanfang/Topsorter/blob/master/image/topsorter.png)
### Contact
- Han Fang ([email protected])
- Srividya Ramakrishnan ([email protected])
- Fritz Sedlazeck ([email protected])

--------
### Topsorter
#### Traversing & finding the longest path (most confident haplotype) in a weighted directed acyclic graph (DAG).
- At the initial construction step, it splits regions of a chromosome by structural variants and creates a weighted DAG.
- Then it updates the edge weights according to barcode information (and/or other quality metrics) produced by the script barcode_profiles.py.
- Finally, Topsorter topologically sorts the graph and finds the longest path (the most confident haplotype).
- Input: VCF file
- Output: PDF files of the graph for each chromosome, plus the longest paths
- Command: `python topsorter.py $vcf`

--------
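The longest-path step described above can be illustrated with a minimal, self-contained sketch. It is not the code in `topsorter.py`: the function name `longest_path`, the `(source, target, weight)` edge format, and the region labels in the example are all assumptions made for illustration. The sketch orders the nodes with Kahn's algorithm and then relaxes edges in that order, which is the standard way to find the heaviest path in a weighted DAG.

```python
from collections import defaultdict, deque


def longest_path(edges):
    """Return (total_weight, path) for the heaviest path in a weighted DAG.

    `edges` is an iterable of (source, target, weight) tuples. This input
    format is hypothetical and chosen only for the example.
    """
    graph = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for src, dst, weight in edges:
        graph[src].append((dst, weight))
        indegree[dst] += 1
        nodes.update((src, dst))
    if not nodes:
        return 0, []

    # Topological sort (Kahn's algorithm): repeatedly emit nodes whose
    # incoming edges have all been processed.
    queue = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt, _ in graph[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle, so it is not a DAG")

    # Relax edges in topological order, remembering the best predecessor.
    best = {n: 0 for n in nodes}
    parent = {n: None for n in nodes}
    for node in order:
        for nxt, weight in graph[node]:
            if best[node] + weight > best[nxt]:
                best[nxt] = best[node] + weight
                parent[nxt] = node

    # Walk back from the node with the highest accumulated weight.
    end = max(order, key=lambda n: best[n])
    total = best[end]
    path = []
    while end is not None:
        path.append(end)
        end = parent[end]
    return total, path[::-1]


# Hypothetical regions: A-B-D is one haplotype, A-C-D another whose edges
# carry more (made-up) barcode support.
example_edges = [("A", "B", 5), ("A", "C", 2), ("B", "D", 1), ("C", "D", 7)]
print(longest_path(example_edges))
```

In this toy graph the path through `C` wins because its edges sum to 9, against 6 for the path through `B`. Raising an edge weight when two regions share more barcodes therefore steers the result toward the better-supported haplotype.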
### BarcodeProfiler
#### Building a barcode profile for the alignments and counting the overlapping barcodes for every split region of a chromosome
- Extract the alignments from every split region into a BAM file and index it
- Identify the barcodes in each of these split regions and count the number of reads per barcode
- Count the barcode overlaps between the split regions of interest
- Input: phased BAM file from 10x data; split regions in BED format, constructed with the Topsorter class (function `exportVCFBed`)
- Output: directory containing the files
  - overlapping_reads.bam
  - overlapping_reads.bam.bai
  - reads_barcode_profile.txt
  - barcode_overlaps_between_regions.txt
- Command: ./barcode_profiles.py -bam \ -bed \ -o \
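The counting steps listed above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the logic of `barcode_profiles.py`: the function names `barcode_profile` and `barcode_overlaps` are hypothetical, and the input is assumed to be already reduced to `(region, barcode)` pairs, one per read, instead of being parsed from a BAM file.

```python
from collections import Counter
from itertools import combinations


def barcode_profile(reads):
    """Count reads per barcode within each region.

    `reads` is an iterable of (region, barcode) pairs, one per read.
    This flattened input format is an assumption for the example.
    """
    profile = {}
    for region, barcode in reads:
        profile.setdefault(region, Counter())[barcode] += 1
    return profile


def barcode_overlaps(profile):
    """Count the barcodes shared by every pair of regions."""
    overlaps = {}
    for a, b in combinations(sorted(profile), 2):
        # Dictionary key views support set intersection directly.
        overlaps[(a, b)] = len(profile[a].keys() & profile[b].keys())
    return overlaps


# Made-up reads for three hypothetical split regions.
example_reads = [
    ("r1", "AAAC"), ("r1", "AAAC"), ("r1", "GGTT"),
    ("r2", "GGTT"), ("r2", "CCAA"),
    ("r3", "AAAC"), ("r3", "GGTT"),
]
profile = barcode_profile(example_reads)
print(profile["r1"])
print(barcode_overlaps(profile))
```

Here `r1` and `r3` share two barcodes while the other pairs share one, so an edge between `r1` and `r3` would receive the strongest barcode support in a downstream weighting step.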