https://github.com/bcgsc/link_str
Analysis scripts developed for genotyping STRs in linked-read data
- Host: GitHub
- URL: https://github.com/bcgsc/link_str
- Owner: bcgsc
- License: gpl-3.0
- Created: 2021-08-24T20:20:19.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2022-03-17T21:12:01.000Z (almost 4 years ago)
- Last Synced: 2024-07-31T20:29:00.473Z (over 1 year ago)
- Language: Python
- Size: 1.78 MB
- Stars: 1
- Watchers: 6
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-linked-reads: Link_STR (Tools)
README
# Genotyping STRs in linked-read data
This repository contains Python scripts developed to:
- extract in-repeat reads (IRRs) using barcodes from linked-read alignments ([IRR extraction](irr))
- estimate the sizes of genomic intervals by calculating the Jaccard index (JI) of barcode sharing ([distance estimate](jaccard_index))
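
As a rough illustration of the barcode-sharing idea, the sketch below computes the Jaccard index of two barcode sets, i.e. the number of shared barcodes divided by the number of distinct barcodes seen in either set. It is not taken from the scripts in this repository: the function name `jaccard_index` and the example barcodes are hypothetical, and the actual [distance estimate](jaccard_index) code may compute and use the value differently.

```python
def jaccard_index(barcodes_a, barcodes_b):
    """Return |A ∩ B| / |A ∪ B| for two collections of barcodes."""
    a, b = set(barcodes_a), set(barcodes_b)
    union = a | b
    if not union:
        # No barcodes in either window: define the index as 0.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical barcodes from reads aligned to two genomic windows.
left_window = ["AAAC", "AAAG", "AACT", "AAGT"]
right_window = ["AAAG", "AACT", "ACCA"]

ji = jaccard_index(left_window, right_window)
print(f"JI = {ji:.2f}")  # 2 shared / 5 distinct barcodes
```

Because reads from the same long DNA molecule carry the same barcode, two windows that lie close together would be expected to share more barcodes (higher JI) than windows that are far apart.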
## Dependencies
- [NumPy](https://numpy.org/)
- [Pandas](https://pandas.pydata.org/)
- [pybedtools](https://daler.github.io/pybedtools/)
- [pysam](https://github.com/pysam-developers/pysam)
- [TRF](https://tandem.bu.edu/trf/trf.html) (for IRR extraction)
- [blastn](https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/) (for IRR extraction)
Author: [Readman Chiu](mailto:rchiu@bcgsc.ca)
:copyright: Canada's Michael Smith Genome Sciences Centre, BC Cancer