https://github.com/dmalzl/genomap
An easy to use tool to generate heatmap like tracks for the UCSC Genome Browser
bioinformatics hacking ucsc-browser
- Host: GitHub
- URL: https://github.com/dmalzl/genomap
- Owner: dmalzl
- License: mit
- Created: 2021-06-24T10:20:14.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2021-06-24T15:15:56.000Z (almost 4 years ago)
- Last Synced: 2025-02-05T20:03:31.057Z (3 months ago)
- Topics: bioinformatics, hacking, ucsc-browser
- Language: Python
- Homepage:
- Size: 15.6 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# genomap
An easy-to-use tool to generate heatmap-like tracks for the UCSC Genome Browser

## Setting up the work environment
In order to use `genomap.py` you will need a working Python 3 installation with [`pandas`](https://pandas.pydata.org/), [`matplotlib`](https://matplotlib.org/), [`numpy`](https://numpy.org/) and [`pyBigWig`](https://github.com/deeptools/pyBigWig). The most straightforward way to get this is to download and install [`miniconda`](https://docs.conda.io/en/latest/miniconda.html) and use the `environment.yml` file to generate a virtual environment containing everything we need.
```bash
git clone https://github.com/dmalzl/genomap.git
cd genomap
conda env create -f environment.yml
conda activate genomapy
```

## Generating a bigWig file from your BAMs
The easiest way to generate a bigWig file from your alignments is to use [`bamCoverage`](https://deeptools.readthedocs.io/en/develop/content/tools/bamCoverage.html) from the [deepTools](https://deeptools.readthedocs.io/en/develop/index.html) suite:
```bash
# <input.bam> and <output.bw> are placeholders for your own file paths
bamCoverage -b <input.bam> \
-o <output.bw> \
-of bigwig \
-bs 5000 \
-p 16 \
--ignoreDuplicates \
--normalizeUsing CPM \
--exactScaling
```
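For intuition about `--normalizeUsing CPM`: each bin's read count is divided by the total number of mapped reads, expressed in millions. The following is a minimal Python sketch of that arithmetic only, not deepTools' implementation. The function name `cpm_per_bin` and the toy counts are made up for illustration, and the sketch assumes that all mapped reads fall into the bins shown.

```python
def cpm_per_bin(bin_counts):
    """Scale raw per-bin read counts to counts per million mapped reads."""
    total_reads = sum(bin_counts)
    if total_reads == 0:
        # No reads at all: every bin stays at zero
        return [0.0 for _ in bin_counts]
    scale = 1_000_000 / total_reads
    return [count * scale for count in bin_counts]


# Toy example: four 5 kb bins with hypothetical read counts
counts = [10, 30, 0, 60]
cpm = cpm_per_bin(counts)
print(cpm)
```

Because the values are relative to the library size, tracks from samples with different sequencing depths become comparable on the same color scale.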
The `bamCoverage` command above generates a coverage track in 5 kb bins, normalized to counts per million (CPM) over the genome, from your input BAM file.

## Converting bigWig file to bedGraph with UCSC suitable RGB column
Now that we have our bigWig file, the next step is to generate a UCSC compatible bedGraph with an itemRGB column. This is done using the `genomap.py` script and is invoked as follows:
```bash
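# <input.bw> and <outputBedGraph> are placeholders for your own file paths
./genomap.py -i <input.bw> \
-bs 5000 \
--vmin 0 \
--vmax p75 \
--colormap coolwarm \
-o <outputBedGraph>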
```
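To make the idea of encoding values as colors concrete, here is a rough, standard-library-only Python sketch. It is not the code of `genomap.py`: it assumes that `p75` means the 75th percentile of the bin values, it replaces matplotlib's `coolwarm` with a plain blue-white-red gradient, and all function names as well as the contents of the name, score and strand columns are hypothetical.

```python
def percentile(values, q):
    """Linearly interpolated q-th percentile (0 <= q <= 100) of values."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of empty sequence")
    pos = (len(ordered) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] * (1 - frac) + ordered[upper] * frac


def value_to_rgb(value, vmin, vmax):
    """Map value onto a blue -> white -> red gradient, clipped to [vmin, vmax]."""
    if vmax <= vmin:
        t = 0.0
    else:
        t = min(max((value - vmin) / (vmax - vmin), 0.0), 1.0)
    if t < 0.5:
        s = t / 0.5  # blue (0,0,255) towards white (255,255,255)
        return round(255 * s), round(255 * s), 255
    s = (t - 0.5) / 0.5  # white towards red (255,0,0)
    return 255, round(255 * (1 - s)), round(255 * (1 - s))


def bed9_line(chrom, start, end, value, vmin, vmax):
    """Build one 9-column BED line whose last column is the itemRgb string."""
    r, g, b = value_to_rgb(value, vmin, vmax)
    # Columns: chrom, start, end, name, score, strand, thickStart, thickEnd, itemRgb
    fields = [chrom, start, end, ".", 0, ".", start, end, f"{r},{g},{b}"]
    return "\t".join(str(f) for f in fields)


# Toy bin values; vmax is taken as their 75th percentile
values = [0.0, 1.0, 2.0, 3.0, 4.0]
vmax = percentile(values, 75)
for i, v in enumerate(values):
    print(bed9_line("chr1", i * 5000, (i + 1) * 5000, v, 0.0, vmax))
```

Clipping at a percentile rather than at the maximum keeps a few very high bins from compressing the rest of the track into a single color.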
The `genomap.py` call above turns the bigWig into a bedGraph with 9 columns, including the itemRGB column, which encodes the bigWig values as RGB colors for the UCSC Genome Browser.

## Converting bedGraph to bigBed
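The next step is to convert the bedGraph to its binary twin, the bigBed. This is done using the [UCSC kentUtils](https://github.com/ENCODE-DCC/kentUtils) suite. Note that `bedToBigBed` requires its input to be sorted by chromosome and start position, which is what the `sort` call below takes care of.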
```bash
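# file names in <...> are placeholders for your own paths
cat <outputBedGraph> | sort -k1,1 -k2,2n > <sorted.bedGraph>
bedToBigBed <sorted.bedGraph> chrom.sizes <output.bb>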
```
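The two inputs to `bedToBigBed` can be illustrated with a small Python sketch: the ordering that `sort -k1,1 -k2,2n` produces (chromosome name first, then numeric start) and the two-column layout of a chrom.sizes file. The helper names and the toy records and sizes are invented for this example. Python compares strings by code point, which corresponds to running `sort` under `LC_ALL=C`.

```python
def sort_bed_lines(lines):
    """Sort BED-like lines by chromosome (as a string), then by numeric start."""
    def key(line):
        fields = line.rstrip("\n").split("\t")
        return fields[0], int(fields[1])
    return sorted(lines, key=key)


def parse_chrom_sizes(lines):
    """Parse 'name<TAB>size' lines into a {name: size} dictionary."""
    sizes = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        name, size = line.split()[:2]
        sizes[name] = int(size)
    return sizes


# Toy records (chrom, start, end) in unsorted order
records = [
    "chr2\t0\t5000",
    "chr1\t10000\t15000",
    "chr1\t5000\t10000",
]
sorted_records = sort_bed_lines(records)
print(sorted_records)

# Toy chrom.sizes content, not real genome sizes
chrom_sizes = parse_chrom_sizes(["chr1\t20000", "chr2\t5000"])
print(chrom_sizes)
```

A quick sanity check along these lines, for example confirming that every chromosome in the bedGraph also appears in chrom.sizes, can catch mismatched genome builds before the conversion is run.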
The chrom.sizes file is a generic tab-separated file with two columns giving the name and the size of each chromosome contained in the bedGraph file. Note that `genomap.py` also generates a PDF containing the colorbar corresponding to the colors in the itemRGB column, which is saved in the same directory as the outputBedGraph. Alternatively, one can use the `--colorbarFile` parameter to set a file path manually.

## Add to TrackHub on UCSC
The last step is to add the generated bigBed to your UCSC TrackHub using the following directives:
```
track <trackName>
shortLabel <shortLabel>
longLabel <longLabel>
bigDataUrl <bigBedUrl>
itemRgb on
type bigBed 9 .
```

# General comment on usage
The UCSC Genome Browser is an online tool for displaying sequencing and other related data. This versatility comes with some caveats, such as a restricted color space for the itemRGB column of bigBed files and a limit on the number of regions that can be displayed simultaneously, which seems to be 1000. A general point to consider is therefore the size of the region one wants to view in the browser, since the heatmap will turn black for regions that span more than 1000 bigBed bins.

For example, to view a 10 Mb region one would need a bin size of at least 10,000,000 / 1,000 = 10,000 bp in order to see the colored version of the bigBed.
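The rule of thumb above can be written as a small Python helper. This is only a sketch of the arithmetic: the limit of 1000 items is the value observed in the text, not an officially documented constant, and the function names are made up for illustration.

```python
import math

# Item limit as observed above ("seems to be"), not an official constant
MAX_ITEMS = 1000


def min_bin_size(region_size_bp, max_items=MAX_ITEMS):
    """Smallest bin size (bp) that keeps a region within max_items bins."""
    return math.ceil(region_size_bp / max_items)


def max_region_size(bin_size_bp, max_items=MAX_ITEMS):
    """Largest region (bp) that can be viewed in color at a given bin size."""
    return bin_size_bp * max_items


print(min_bin_size(10_000_000))  # bin size needed for a 10 Mb view
print(max_region_size(5000))     # widest colored view with 5 kb bins
```

With the 5 kb bins used in the commands above, this suggests keeping the browser window at or below roughly 5 Mb.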