genomap

An easy to use tool to generate heatmap like tracks for the UCSC Genome Browser

Setting up the work environment

To use genomap.py you need a working Python 3 installation with pandas, matplotlib, numpy and pyBigWig. The most straightforward way to get this is to download and install miniconda and use the environment.yml file to create a virtual environment containing everything needed.

git clone https://github.com/dmalzl/genomap.git
cd genomap
conda env create -f environment.yml
conda activate genomapy

Generating a bigWig file from your BAMs

The easiest way to generate a bigWig file from your alignments is to use bamCoverage from the deepTools suite:

bamCoverage -b <inputBAM> \
            -o <outputFileName> \
            -of bigwig \
            -bs 5000 \
            -p 16 \
            --ignoreDuplicates \
            --normalizeUsing CPM \
            --exactScaling

This generates a coverage track from your input BAM file in 5 kb bins, normalized to counts per million (CPM) over the genome.
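To make the normalization concrete, here is a minimal sketch of CPM scaling, assuming raw per-bin read counts are already available. The function name `cpm` and the example counts are hypothetical; bamCoverage does this internally and also handles details such as duplicate filtering that are not modeled here.

```python
def cpm(bin_counts, total_mapped_reads):
    """Scale raw per-bin read counts to counts per million mapped reads."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    scale = 1_000_000 / total_mapped_reads
    return [count * scale for count in bin_counts]


# Hypothetical example: three 5 kb bins from a library with 2 million mapped reads.
print(cpm([10, 0, 250], total_mapped_reads=2_000_000))  # [5.0, 0.0, 125.0]
```

Because every bin is divided by the same library size, tracks from samples with different sequencing depths become comparable on one color scale.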

Converting bigWig file to bedGraph with UCSC suitable RGB column

Now that we have our bigWig file, the next step is to generate a UCSC-compatible bedGraph with an itemRGB column. This is done with the genomap.py script, which is invoked as follows:

./genomap.py -i <bigwigFile> \
           -bs 5000 \
           --vmin 0 \
           --vmax p75 \
           --colormap coolwarm \
           -o <outputBedGraph>

This turns the bigWig into a 9-column bedGraph whose itemRGB column encodes the bigWig values as RGB colors for the UCSC Genome Browser.
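The idea of encoding values as colors can be sketched as follows. This is not the genomap.py implementation: it assumes that `p75` denotes the 75th percentile of the bin values, and it swaps the matplotlib colormap for a plain two-color linear ramp so that the example needs only the standard library. The helper names `percentile`, `value_to_rgb` and `bed9_line`, and the placeholder name, score and strand fields, are hypothetical.

```python
def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    pos = (len(ordered) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def value_to_rgb(value, vmin, vmax, low=(0, 0, 255), high=(255, 0, 0)):
    """Clip value to [vmin, vmax], then interpolate between two RGB colors."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    t = (min(max(value, vmin), vmax) - vmin) / (vmax - vmin)
    return tuple(round(lo + (hi - lo) * t) for lo, hi in zip(low, high))


def bed9_line(chrom, start, end, rgb):
    """Format one BED9 record; the ninth field is the itemRgb column."""
    fields = [chrom, start, end, ".", 0, ".", start, end, ",".join(map(str, rgb))]
    return "\t".join(map(str, fields))


# Hypothetical bin values for five consecutive 5 kb bins.
values = [0.0, 1.0, 2.0, 3.0, 4.0]
vmax = percentile(values, 75)
for i, value in enumerate(values):
    rgb = value_to_rgb(value, vmin=0.0, vmax=vmax)
    print(bed9_line("chr1", i * 5000, (i + 1) * 5000, rgb))
```

Clipping at a percentile rather than at the maximum keeps a few very high bins from compressing the rest of the track into a narrow color range.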

Converting bedGraph to bigBed

The next step is to convert the bedGraph to its binary twin, the bigBed. This is done using the UCSC kentUtils suite. Note that you need to sort the bedGraph by chromosome and start position first:

cat <bedGraphFile> | sort -k1,1 -k2,2n > <sortedBedGraph>
bedToBigBed <sortedBedGraph> chrom.sizes <outputBigBed>

The chrom.sizes file is a generic tab-separated file with two columns giving the name and the size of each chromosome contained in the bedGraph file. Note that genomap.py also generates a PDF containing the colorbar corresponding to the colors in the itemRGB column, which is saved in the same directory as the outputBedGraph. Alternatively, one can use the --colorbarFile parameter to set a filepath manually.
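Before running bedToBigBed it can help to check the sorted bedGraph against chrom.sizes. The following is a minimal sketch with hypothetical helper names; it assumes tab-separated input and compares chromosome names by code point, which corresponds to `sort` under the C locale.

```python
def read_chrom_sizes(lines):
    """Parse 'name<TAB>size' lines into a dict of chromosome sizes."""
    sizes = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        name, size = line.split("\t")[:2]
        sizes[name] = int(size)
    return sizes


def check_bed_lines(lines, sizes):
    """Report unknown chromosomes, out-of-range ends and unsorted records."""
    problems = []
    previous = None
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        chrom, start, end = line.rstrip("\n").split("\t")[:3]
        start, end = int(start), int(end)
        if chrom not in sizes:
            problems.append(f"line {number}: {chrom} missing from chrom.sizes")
        elif end > sizes[chrom]:
            problems.append(f"line {number}: end {end} exceeds {chrom} size {sizes[chrom]}")
        # Same key order as `sort -k1,1 -k2,2n`: chromosome first, then start.
        if previous is not None and (chrom, start) < previous:
            problems.append(f"line {number}: not sorted by chromosome and start")
        previous = (chrom, start)
    return problems


# Hypothetical in-memory example; real files can be passed as open file objects.
sizes = read_chrom_sizes(["chr1\t10000\n", "chr2\t8000\n"])
print(check_bed_lines(["chr1\t0\t5000\n", "chr1\t5000\t10000\n"], sizes))  # []
```

An empty list means no problems were found by these three checks; it does not guarantee that bedToBigBed will accept the file.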

Add to TrackHub on UCSC

The last step is to add the generated bigBed to your UCSC TrackHub using the following directives:

track <trackName>
shortLabel      <trackShortLabel>
longLabel       <trackLongLabel>
bigDataUrl      <path/to/bigBed>
itemRgb         on
type    bigBed 9 .

General comment on usage

The UCSC Genome Browser is an online tool to display sequencing and other related data. Its versatility also brings some caveats, such as a restricted colorspace for the itemRGB column of bigBed files and a limit on the number of regions that can be displayed simultaneously, which seems to be 1000. A general point to consider is therefore the size of the regions one wants to view in the browser, since the heatmap will turn black for regions that span more than 1000 bigBed bins. An example would be as follows:

To view a 10 Mb region, one would need a bin size of at least 10,000,000 / 1,000 = 10,000 bp in order to enjoy the colored version of the bigBed.
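The same arithmetic can be wrapped in two small helpers. This is a sketch that takes the 1000-region limit described above as an assumption rather than a documented constant; the function names are hypothetical.

```python
MAX_COLORED_ITEMS = 1000  # limit observed in this README, not an official UCSC value


def min_bin_size(region_bp, max_items=MAX_COLORED_ITEMS):
    """Smallest bin size (bp) that keeps a region within max_items bins."""
    if region_bp <= 0 or max_items <= 0:
        raise ValueError("region_bp and max_items must be positive")
    return (region_bp + max_items - 1) // max_items  # integer ceiling division


def max_viewable_region(bin_size_bp, max_items=MAX_COLORED_ITEMS):
    """Largest region (bp) that still shows colors for a given bin size."""
    if bin_size_bp <= 0 or max_items <= 0:
        raise ValueError("bin_size_bp and max_items must be positive")
    return bin_size_bp * max_items


print(min_bin_size(10_000_000))   # 10000
print(max_viewable_region(5000))  # 5000000
```

With the 5 kb bins used in the commands above, this suggests that views wider than about 5 Mb would lose their colors.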
