Giter Club home page Giter Club logo

outrigger's Introduction

OutriggerLogo

BuildStatuscodecov PyPIVersions PythonVersionCompatibility

Large-scale detection and calculation of alternative splicing with Outrigger

Outrigger is a program which uses junction reads from RNA seq data, and a graph database to create a de novo alternative splicing annotation with a graph database, and quantify percent spliced-in (Psi) of the events.

Features

  • Finds novel splicing events, including novel exons! (outrigger index) from .bam files
  • (optional) Validates that exons have correct splice sites, e.g. GT/AG and AT/AC for mammalian systems (outrigger validate)
  • Calculate "percent spliced-in" (Psi/Ψ) scores for all your samples given the validated events (or the original events if you opted not to validate) via outrigger psi

OutriggerOverview

Installation

To install outrigger, we recommend using the Anaconda Python Distribution and creating an environment.

Quick install

(Requires anaconda)

make install

Then, activate the anaconda environment to put outrigger commands in your $PATH

source activate outrigger-env

Manual Install

You'll want to add the bioconda channel to make installing bedtools and its Python wrapper, pybedtools easy (these programs are necessary for both outrigger index and outrigger validate).

conda config --add channels r
conda config --add channels bioconda

Create an environment called outrigger-env. Python 2.7, Python 3.4, and Python 3.5 are supported.

conda create --name outrigger-env outrigger

Now activate that environment:

source activate outrigger-env

And run the installation:

pip install .

Validate Installation

To check that it installed properly, try the command with the help option (-h), outrigger -h. The output should look like this:

$ outrigger -h
usage: outrigger [-h] [--version] {index,validate,psi} ...

outrigger (1.0.0dev). Calculate "percent-spliced in" (Psi) scores of
alternative splicing on a *de novo*, custom-built splicing index -- just for
you!

positional arguments:
  {index,validate,psi}  Sub-commands
    index               Build an index of splicing events using a graph
                        database on your junction reads and an annotation
    validate            Ensure that the splicing events found all have the
                        correct splice sites
    psi                 Calculate "percent spliced-in" (Psi) values using the
                        splicing event index built with "outrigger index"

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit

Bleeding edge code from Github (here)

For advanced users, if you have git and Anaconda Python installed, you can:

  1. Clone this repository
  2. Change into that directory
  3. Create an environment named outrigger-env with the necessary packages from Anaconda and the Python Package Index (PyPI).
  4. Activate the environment

These steps are shown in code below.

git clone https://github.com/YeoLab/outrigger.git
cd outrigger
conda env create --file environment.yml
source activate outrigger-env

Quick start

If you just want to know how to run this on your data with the default parameters, start here. Let's say you performed your alignment in the folder called ~/projects/tasic2016/analysis/tasic2016_v1, and that's where your SJ.out.tab files from the STAR aligner are (they're output into the same folder as the .bam files). First you'll need to change directories to that folder with cd.

cd ~/projects/tasic2016/analysis/tasic2016_v1

Then you need find all alternative splicing events, which you do by running outrigger index on the splice junction files and the gtf. Here is an example command:

Input: .SJ.out.tab files

outrigger index --sj-out-tab *SJ.out.tab \
    --gtf /projects/ps-yeolab/genomes/mm10/gencode/m10/gencode.vM10.annotation.gtf

Input: .bam files

If you're using .bam files instead of SJ.out.tab files, never despair! Below is an example command. Keep in mind that for this program to work, the events must be sorted and indexed.

outrigger index --bam *sorted.bam \
    --gtf /projects/ps-yeolab/genomes/mm10/gencode/m10/gencode.vM10.annotation.gtf

Next, you'll want to validate that the splicing events you found follow biological rules, such as being containing GT/AG (mammalian major spliceosome) or AT/AC (mammalian minor splicesome) sequences. To do that, you'll need to provide the genome name (e.g. mm10) and the genome sequences. An example command is below:

outrigger validate --genome mm10 \
    --fasta /projects/ps-yeolab/genomes/mm10/GRCm38.primary_assembly.genome.fa

Finally, you can calculate percent spliced in (Psi) of your splicing events! Thankfully this is very easy:

outrigger psi

It should be noted that ALL of these commands should be performed in the same directory, so no moving.

Quick start summary

Here is a summary the commands in the order you would use them for outrigger!

cd ~/projects/tasic2016/analysis/tasic2016_v1
outrigger index --sj-out-tab *SJ.out.tab \
    --gtf /projects/ps-yeolab/genomes/mm10/gencode/m10/gencode.vM10.annotation.gtf
outrigger validate --genome mm10 \
    --fasta /projects/ps-yeolab/genomes/mm10/GRCm38.primary_assembly.genome.fa
outrigger psi

This will create a folder called outrigger_output, which at the end should look like the one below. Each file and folder is annotated with which command produced it.

$ tree outrigger_output
outrigger_output..........................................................index
├── index.................................................................index
│   ├── gtf...............................................................index
│   │   ├── gencode.vM10.annotation.gtf...................................index
│   │   ├── gencode.vM10.annotation.gtf.db................................index
│   │   └── novel_exons.gtf...............................................index
│   ├── exon_direction_junction_triples.csv...............................index
│   ├── mxe...............................................................index
│   │   ├── event.bed.....................................................index
│   │   ├── events.csv....................................................index
│   │   ├── exon1.bed.....................................................index
│   │   ├── exon2.bed.....................................................index
│   │   ├── exon3.bed.....................................................index
│   │   ├── exon4.bed.....................................................index
│   │   ├── intron.bed....................................................index
│   │   ├── splice_sites.csv...........................................validate
│   │   └── validated..................................................validate
│   │       └── events.csv.............................................validate
│   └── se................................................................index
│       ├── event.bed.....................................................index
│       ├── events.csv....................................................index
│       ├── exon1.bed.....................................................index
│       ├── exon2.bed.....................................................index
│       ├── exon3.bed.....................................................index
│       ├── intron.bed....................................................index
│       ├── splice_sites.csv...........................................validate
│       └── validated..................................................validate
│           └── events.csv.............................................validate
├── junctions.............................................................index
│   ├── metadata.csv......................................................index
│   └── reads.csv.........................................................index
└── psi.....................................................................psi
    ├── mxe.................................................................psi
    |   ├── psi.csv.........................................................psi
    │   └── summary.csv.....................................................psi
    ├── outrigger_psi.csv...................................................psi
    └── se..................................................................psi
        ├── psi.csv.........................................................psi
        └── summary.csv.....................................................psi

10 directories, 26 files

outrigger's People

Contributors

olgabot avatar willingc avatar mlovci avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.