https://github.com/gersteinlab/annotation-k-mer-extractor
Uses a sliding window to extract k-mers for annotations (specifically exons) in GTF format
Last synced: 11 months ago
- Host: GitHub
- URL: https://github.com/gersteinlab/annotation-k-mer-extractor
- Owner: gersteinlab
- License: gpl-3.0
- Created: 2018-08-20T17:40:38.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2018-08-20T17:58:40.000Z (almost 8 years ago)
- Last Synced: 2025-03-13T16:45:55.394Z (about 1 year ago)
- Language: Python
- Homepage:
- Size: 19.5 KB
- Stars: 0
- Watchers: 25
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# exon-k-mer-extractor
Uses a sliding window to extract k-mers for annotations (specifically exons) in GTF format
## Dependencies:
1. Linux
2. Python 2.6
## Get Started:
### Input
`python kmer_analysis.py -g {input_annotation_gtf}.gtf -k n -r m -o {output_bed}.bed`
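
The sliding-window idea named in the project description can be illustrated with a minimal sketch. This is not the code in `kmer_analysis.py`: the function name `sliding_windows`, the `step` parameter, and the 0-based half-open coordinates are assumptions made for illustration, and the sketch makes no claim about how the script interprets `-k` or `-r`.

```python
def sliding_windows(start, end, k, step=1):
    """Yield (window_start, window_end) pairs of width k inside [start, end).

    Hypothetical helper: coordinates are assumed 0-based and half-open.
    """
    if k <= 0 or step <= 0:
        raise ValueError("k and step must be positive")
    pos = start
    # Stop as soon as a full window of width k no longer fits.
    while pos + k <= end:
        yield (pos, pos + k)
        pos += step


# Windows of width 3 over a 5-nucleotide interval.
windows = list(sliding_windows(100, 105, 3))
```

Each yielded pair could then be used to slice a sequence or to write one BED row; an interval shorter than `k` yields no windows at all.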
### Output BED Format
**column 1**: chromosome number (chr1, chr2, chr3...)
**column 2**: start location (1445550...)
**column 3**: end location (1445800...)
**column 4**: unique id (chr1-14403:14404:ED0...)
In column 4, the unique id ends with one of four relative-position designations (IU, EU, ED, ID) followed by a number.
**IU** = intron upstream
**EU** = exon upstream
For IU and EU, the number indicates how many nucleotides away from the start location.
**ED** = exon downstream
**ID** = intron downstream
For ED and ID, the number indicates how many nucleotides away from the end location.
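
As a rough sketch of how the output could be consumed downstream, the snippet below splits one BED row and decodes the designation at the end of column 4. It assumes tab-separated columns and an id that ends in a two-letter code plus an integer, as in the example `chr1-14403:14404:ED0`; the names `parse_bed_line` and `LABELS` and the sample row are hypothetical and not part of the repository.

```python
import re

# Assumed shape of the id suffix: one of IU, EU, ED, ID followed by an integer.
_SUFFIX = re.compile(r"(IU|EU|ED|ID)(\d+)$")

LABELS = {
    "IU": "intron upstream",
    "EU": "exon upstream",
    "ED": "exon downstream",
    "ID": "intron downstream",
}


def parse_bed_line(line):
    """Split one four-column BED row and decode the column-4 designation."""
    chrom, start, end, uid = line.rstrip("\n").split("\t")[:4]
    match = _SUFFIX.search(uid)
    if match is None:
        raise ValueError("unrecognized id: %r" % uid)
    code, offset = match.group(1), int(match.group(2))
    return {
        "chrom": chrom,
        "start": int(start),
        "end": int(end),
        "id": uid,
        "position": LABELS[code],
        "offset": offset,
    }


# Illustrative row built from the example id above, not real output.
record = parse_bed_line("chr1\t14403\t14404\tchr1-14403:14404:ED0")
```

Anchoring the pattern at the end of the id keeps the parser independent of how the leading `chr1-14403:14404` part is laid out.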