https://github.com/txje/sequence-bias-adjustment
Method and utilities for reweighting sequence read counts to adjust for nucleotide bias from assay or sequencing
- Host: GitHub
- URL: https://github.com/txje/sequence-bias-adjustment
- Owner: txje
- License: mit
- Created: 2014-06-10T18:18:17.000Z (about 10 years ago)
- Default Branch: main
- Last Pushed: 2020-06-18T16:03:50.000Z (about 4 years ago)
- Last Synced: 2024-02-12T23:48:31.276Z (5 months ago)
- Language: Python
- Size: 438 KB
- Stars: 4
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
Lists
- awesome-atac-analysis - sequence bias correction
README
Correcting nucleotide-specific bias in high-throughput sequencing data
----------------------------------------------------------------------

Reweight high-throughput sequencing reads to account for nucleotide-specific bias
from any source, including assay and sequencing biases.

Requirements:
* samtools (in PATH)
* Python (2.7)
* numpy
* pysam
* matplotlib (pyplot)

Installation:
git clone http://github.com/txje/sequence-bias-adjustment
cd sequence-bias-adjustment

Usage:
sh seqbias_pipe.sh <ref> <chroms> <bam> <k> <outdir> <prefix> [--resume ]
ref     reference genome FASTA file
chroms  list of chromosomes to correct (one per line)
bam     aligned BAM file, filtered beforehand to mask repetitive elements and similar regions (if needed)
k       tile size to correct (5 is recommended in most cases)
outdir  directory for all output files
prefix  string prepended to the names of all intermediate and output files

Example (bias correction on an ENCODE DNase-seq data set):
git clone http://github.com/txje/sequence-bias-adjustment
cd sequence-bias-adjustment
mkdir example_data
cd example_data
wget http://hgdownload.soe.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeOpenChromDnase/wgEncodeOpenChromDnaseGm12878AlnRep1.bam
wget http://hgdownload.soe.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeOpenChromDnase/wgEncodeOpenChromDnaseGm12878AlnRep1.bam.bai
wget http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.chrom.sizes
wget http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.2bit
wget http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/twoBitToFa
chmod 700 twoBitToFa
./twoBitToFa hg19.2bit hg19.fa
cd ..
sh seqbias_pipe.sh example_data/hg19.fa example_data/hg19.chrom.sizes example_data/wgEncodeOpenChromDnaseGm12878AlnRep1.bam 5 example_results gm12878.dnase

Jeremy Wang, Ph.D.
Department of Genetics
University of North Carolina at Chapel Hill

MIT License (see LICENSE.txt)
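
Illustrative sketch of k-mer reweighting:

The general idea of reweighting reads by tile (k-mer) can be shown with a small Python sketch. This is not the repository's code and may differ from the algorithm it actually implements. The function names `kmer_counts` and `kmer_weights`, the toy counts, and the simple frequency-ratio weighting are all assumptions made for illustration only.

```python
from collections import Counter


def kmer_counts(sequences, k):
    """Count every k-mer (tile) of length k across the given sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def kmer_weights(observed, baseline):
    """Weight each k-mer by (baseline frequency) / (observed frequency).

    k-mers seen more often than the baseline predicts get a weight below 1;
    under-represented k-mers get a weight above 1.
    """
    obs_total = float(sum(observed.values()))
    base_total = float(sum(baseline.values()))
    weights = {}
    for kmer, obs in observed.items():
        base = baseline.get(kmer, 0)
        if base == 0:
            continue  # no baseline estimate for this k-mer, so no weight
        weights[kmer] = (base / base_total) / (obs / obs_total)
    return weights


# Hypothetical toy counts: "AC" is over-represented at read starts
# relative to a uniform baseline.
observed = Counter({"AC": 3, "GT": 1})
baseline = Counter({"AC": 1, "GT": 1})
weights = kmer_weights(observed, baseline)

# Reweighted counts: each observed count scaled by its k-mer weight.
adjusted = {kmer: observed[kmer] * w for kmer, w in weights.items()}
```

In this toy case the adjusted counts for the two k-mers come out equal, matching the uniform baseline, while the total count stays the same.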