https://github.com/wckdouglas/liftover_bam
https://github.com/wckdouglas/liftover_bam
bam-files dna-seq ngs-analysis rna-seq
Last synced: 3 months ago
JSON representation
- Host: GitHub
- URL: https://github.com/wckdouglas/liftover_bam
- Owner: wckdouglas
- Created: 2021-12-03T03:17:34.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-08-08T01:39:42.000Z (almost 3 years ago)
- Last Synced: 2025-02-03T09:45:34.943Z (5 months ago)
- Topics: bam-files, dna-seq, ngs-analysis, rna-seq
- Language: Python
- Homepage:
- Size: 55.7 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Lifting over bam #
[](https://github.com/wckdouglas/liftover_bam/actions/workflows/ci.yml) [](https://badge.fury.io/py/liftover-bam)
Sometimes for amplicon sequencings, we would like to map reads to the amplicon sequence only but bringing them back to genomic coordinates for easy variant calling and viewing.
Let's say we have a gene in `chr1:100-1000`, and we would first extract this locus from the genome fasta file to make a new fasta record with name `>chr1:100-1000`, this can be done with:
```
echo "chr1:100-1000" | samtools faidx -r - genome.fa > gene.fa
```
and map the reads to this single gene fasta file with `bwa` or `bowtie2` to make a bam alignment file:
```
bwa mem gene.fa query.fq | samtools view -b > gene.bam
```So what if you want to put these alignments back to the genomic coordinates after that?
The `liftover_bam.liftover` function is trying to solve this problem in pure python!
```
gene_bam="gene.bam"
genome_bam="any.bam_file_that_maps_to_the_genome"
out_bam="where_you_want_your_output_bam_file_to_be"
liftover(gene_bam, genome_bam, out_bam)
```