https://github.com/raphael-group/NAIBR
Novel Adjacency Identification with Barcoded Reads
- Host: GitHub
- URL: https://github.com/raphael-group/NAIBR
- Owner: raphael-group
- License: mit
- Created: 2017-05-09T01:01:51.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2022-04-19T09:40:25.000Z (over 2 years ago)
- Last Synced: 2024-07-31T20:29:55.626Z (5 months ago)
- Language: Python
- Size: 7.72 MB
- Stars: 14
- Watchers: 13
- Forks: 4
- Open Issues: 19
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-linked-reads - NAIBR (Tools)
README
## Overview
NAIBR (Novel Adjacency Identification with Barcoded Reads) identifies novel adjacencies created by structural variation events such as deletions, duplications, inversions, and complex rearrangements using linked-read whole-genome sequencing data produced by 10X Genomics. Please refer to the [publication](https://doi.org/10.1093/bioinformatics/btx712) for details about the method.
NAIBR takes as input a BAM file produced by 10X Genomics' Long Ranger pipeline and outputs a BEDPE file containing predicted novel adjacencies and a likelihood score for each adjacency.
## Installing NAIBR
```
git clone https://github.com/raphael-group/NAIBR.git
```
NAIBR is written in Python 2.7 and requires the following dependencies: pysam, numpy, scipy, subprocess, and matplotlib.
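Before running NAIBR it can help to confirm that the dependencies listed above are importable. The following is a minimal sketch, not part of NAIBR itself: `missing_modules` and `REQUIRED` are hypothetical names introduced here for illustration. `subprocess` is left out of the list because it is part of the Python standard library.

```python
import importlib

# Third-party packages named in the README (subprocess ships with Python).
REQUIRED = ["pysam", "numpy", "scipy", "matplotlib"]


def missing_modules(names):
    """Return the entries of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # The module is not installed in the current environment.
            missing.append(name)
    return missing
```

Calling `missing_modules(REQUIRED)` returns an empty list when every package is available, and otherwise the names that still need to be installed.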
## Running NAIBR
NAIBR can be run using the following command:
```
python NAIBR.py <configfile>
```
A template config file can be found in example/example.config. The following parameters can be set in the config file:
* bam_file: Input BAM file (required)
* min_mapq: Minimum mapping quality for a read to be included in analysis (default: 40)
* outdir: Output directory (default: . )
* d: The maximum distance between reads in a linked-read
* blacklist: Tab-separated list of regions to be excluded from analysis (default: None)
* candidates: List in BEDPE format of novel adjacencies to be scored by NAIBR. This will override automatic detection of candidate novel adjacencies.
* threads: Number of threads (default: 1)
* min_sv: Minimum size of a structural variant to be detected (default: lmax, the 95th percentile of the paired-end read insert size distribution)
* k: Minimum number of barcode overlaps supporting a candidate novel adjacency (default: 3)
## Output
NAIBR outputs a BEDPE file containing all scored novel adjacencies. Predicted novel adjacencies with scores greater than the threshold c are labelled 'PASS' and others are labelled 'FAIL'.
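To keep only the adjacencies labelled 'PASS', the output file can be filtered after the run. The sketch below assumes the file is tab-separated and that the label appears as its own field; it does not assume a particular column order, since the exact layout is not described here. `passing_records` is a hypothetical helper, not part of NAIBR.

```python
def passing_records(lines, label="PASS"):
    """Yield tab-split rows that contain a field exactly equal to `label`."""
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment-style header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if label in fields:
            yield fields
```

For example, `with open("example/NAIBR_SVs.bedpe") as handle: rows = list(passing_records(handle))` would collect the passing rows from the example output.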
## Example
Example files are provided in the 'example' directory. Running
```
python NAIBR.py example/example.config
```
will produce the file 'example/NAIBR_SVs.bedpe'.
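To run NAIBR on other data, the parameters described under Running NAIBR need to be collected in a config file. The following is a minimal sketch of generating one. It assumes the config file holds one `key=value` pair per line, which is not specified in this README, so compare the result with example/example.config before relying on it. `render_config` is a hypothetical helper, not part of NAIBR.

```python
def render_config(params):
    """Render parameters as `key=value` lines (assumed config syntax)."""
    # bam_file is the only parameter marked as required.
    if "bam_file" not in params:
        raise ValueError("bam_file is required")
    # Sort keys so the generated file is stable across runs.
    return "".join(
        "{}={}\n".format(key, value) for key, value in sorted(params.items())
    )
```

The returned string can be written to a file and its path passed to `NAIBR.py` in place of `example/example.config`.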
## Citing NAIBR
Elyanow, Rebecca, Hsin-Ta Wu, and Benjamin J. Raphael. "Identifying structural variants using linked-read sequencing data." Bioinformatics (2017).
```
@article{elyanow2017identifying,
title={Identifying structural variants using linked-read sequencing data},
author={Elyanow, Rebecca and Wu, Hsin-Ta and Raphael, Benjamin J},
journal={Bioinformatics},
year={2017}
}
```