IPAFinder

Description

IPAFinder performs de novo identification and quantification of dynamic IpA (intronic polyadenylation) events using standard RNA-seq data, independent of any prior poly(A) site annotation. Assuming that an intronic poly(A) site is used in a given intron, IPAFinder models the normalized, single-nucleotide-resolution RNA-seq read coverage profile and identifies a profound drop in coverage to infer the poly(A) site in use. To detect skipped IpA events, IPAFinder recognizes the cryptic 3′ splice site from junction-spanning reads and concatenates the preceding exon with the potential terminal exon. IPAFinder can also exclude alternative splicing events, such as alternative 5′ splice site usage and cryptic exon activation, by recognizing junction-spanning reads.
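The coverage-drop idea can be illustrated with a minimal Python sketch. It is not IPAFinder's actual implementation: the function name find_coverage_drop, the two-plateau mean-squared-error criterion, and the toy coverage values are all assumptions chosen for illustration. The sketch tries every split point in a per-base coverage vector and reports the one where a high upstream plateau followed by a low downstream plateau fits best.

```python
from typing import List, Optional, Tuple


def _sse(values: List[float]) -> float:
    """Sum of squared deviations of the values from their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def find_coverage_drop(coverage: List[float]) -> Optional[Tuple[int, float]]:
    """Return (split_index, mse) for the best two-plateau fit, or None.

    split_index is the first position of the downstream (low) segment.
    Only splits whose upstream mean exceeds the downstream mean are
    considered, because a used poly(A) site should end coverage.
    """
    n = len(coverage)
    best: Optional[Tuple[int, float]] = None
    for i in range(1, n):
        upstream, downstream = coverage[:i], coverage[i:]
        if sum(upstream) / i <= sum(downstream) / (n - i):
            continue  # coverage does not drop at this split
        mse = (_sse(upstream) + _sse(downstream)) / n
        if best is None or mse < best[1]:
            best = (i, mse)
    return best


# Toy profile (hypothetical values): five bases at depth 10, then five at 1.
profile = [10.0, 10.0, 10.0, 10.0, 10.0, 1.0, 1.0, 1.0, 1.0, 1.0]
print(find_coverage_drop(profile))
```

The slice-based scan is quadratic in the intron length and is kept only for readability; a real tool working on long introns would likely use prefix sums or vectorized arithmetic.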

Figure: diagram illustrating the IPAFinder algorithm.

Installation

IPAFinder consists of both Python (3.5+) and R scripts:

  1. Install the following software pre-requisites:

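    i. python (required third-party packages HTSeq, numpy, and scipy; the scripts also use itertools, collections, multiprocessing, argparse, os, warnings, and subprocess, which ship with the Python standard library and need no separate installation)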

    ii. R (required packages optparse, dplyr, stringr, and DEXSeq)

  2. Clone the latest development version of IPAFinder and change directory:

 git clone https://github.com/ZhaozzReal/IPAFinder.git
 cd IPAFinder
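Before running the scripts, it can help to confirm that the third-party Python packages are importable. The snippet below is a small sketch for that purpose and is not part of IPAFinder; the helper name missing_modules is hypothetical.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Third-party packages named in the installation notes.
THIRD_PARTY = ["HTSeq", "numpy", "scipy"]

missing = missing_modules(THIRD_PARTY)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required third-party packages were found.")
```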

Usage

IPAFinder has three sub-commands:

1. IPAFinder_GetAnno.py: Generate annotation file containing intron and exon information

2. IPAFinder_DetectIPA.py: Detect and quantify IPA sites, and calculate read counts of all exons

3. Infer_DUIPA.R: Infer differential usage of IPA sites

Generate specialized annotation file

A RefSeq GTF file can be downloaded from the UCSC website: https://genome.ucsc.edu/.

The UCSC tool gtfToGenePred is required here.

Command

python IPAFinder_GetAnno.py -gtf /path/to/hg38refGene.gtf -output /path/to/IPAFinder_anno_hg38.txt

We provide pre-generated annotation files for hg19, hg38, and mm10, and we suggest that users use them directly.

Detect IPA sites and quantify their usage, and calculate read counts of all exons

Command

python IPAFinder_DetectIPA.py -b /path/to/allbamfiles.txt -anno /path/to/IPAFinder_anno_hg38.txt -p 10 -o /path/to/IPAFinder_IPUI.txt

allbamfiles.txt lists the BAM files for the two conditions, as shown below:

condition1=/path/to/ctrl1.bam,/path/to/ctrl2.bam 
condition2=/path/to/case1.bam,/path/to/case2.bam
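As a rough illustration of this file format, the following sketch parses such a listing into a mapping from condition name to BAM paths. The function name parse_bam_list is hypothetical, and the parsing rules (one name=path,path line per condition, blank lines ignored) are assumptions read off the example above rather than IPAFinder's own parser.

```python
from typing import Dict, List


def parse_bam_list(text: str) -> Dict[str, List[str]]:
    """Parse 'condition=a.bam,b.bam' lines into {condition: [paths]}."""
    conditions: Dict[str, List[str]] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        name, sep, paths = line.partition("=")
        if not sep:
            raise ValueError(f"expected 'name=path,...' but got: {line!r}")
        conditions[name.strip()] = [p.strip() for p in paths.split(",") if p.strip()]
    return conditions


example = """condition1=/path/to/ctrl1.bam,/path/to/ctrl2.bam
condition2=/path/to/case1.bam,/path/to/case2.bam
"""
for condition, bams in parse_bam_list(example).items():
    print(condition, bams)
```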

After counting the reads mapped to all exons, IPAFinder expects the results to be located in their own sub-directory. For example, newly generated results may appear with the following directory structure:

project/
  |-- ctrl1_exoncount.txt
  |-- ctrl2_exoncount.txt
  |-- case1_exoncount.txt
  |-- case2_exoncount.txt
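The listing suggests that each exon-count file is named after its BAM file. Assuming that naming rule (BAM base name plus _exoncount.txt, which is inferred from the example and not confirmed here), the expected file names could be derived as in this sketch. The helper name expected_exoncount_files is hypothetical.

```python
from pathlib import PurePosixPath
from typing import List


def expected_exoncount_files(bam_paths: List[str], project_dir: str) -> List[str]:
    """Map each BAM path to the exon-count file expected in project_dir."""
    project = PurePosixPath(project_dir)
    return [
        str(project / f"{PurePosixPath(bam).stem}_exoncount.txt")
        for bam in bam_paths
    ]


bams = ["/path/to/ctrl1.bam", "/path/to/case1.bam"]
for path in expected_exoncount_files(bams, "/path/to/project"):
    print(path)
```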

Infer differential usage of IPA sites

DEXSeq, which is widely used for differential exon usage analysis of RNA-seq data, is applied to detect differential usage of IPA sites. This statistical framework accounts for biological variability between replicates and is robust to changes in isoform abundance between conditions.

Command

Rscript Infer_DUIPA.R -b /path/to/allbamfiles.txt -I /path/to/IPAFinder_IPUI.txt -d /path/to/project -o /path/to/IPAFinder_DUIPA.txt

Final results will be saved in the file IPAFinder_DUIPA.txt.

The final output format is as follows:

Column          Description
SYMBOL          Gene symbol
Intron_rank     Rank of the intron containing the IpA event
Terminal_exon   Genomic location of the terminal exon of the IpA isoform
IPAtype         Type of terminal exon (Skipped or Composite)
ctrl1           IPUI estimate for sample ctrl1
ctrl2           IPUI estimate for sample ctrl2
case1           IPUI estimate for sample case1
case2           IPUI estimate for sample case2
IPUI_diff       Difference in mean IPUI between conditions
pvalue          P value for testing differential usage of the terminal exon
padj            Adjusted P value for testing differential usage of the terminal exon
change          Direction of change of the IpA event (UP, DOWN, or NOT)
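To make the relationship between the per-sample IPUI columns, IPUI_diff, and change concrete, here is a hedged sketch. The sign convention (case minus control) and both cutoffs (min_diff=0.1, max_padj=0.05) are assumptions made for illustration; the cutoffs actually applied by Infer_DUIPA.R are not stated here. The function name classify_change is hypothetical.

```python
from statistics import mean
from typing import Sequence, Tuple


def classify_change(
    ctrl_ipui: Sequence[float],
    case_ipui: Sequence[float],
    padj: float,
    min_diff: float = 0.1,   # assumed cutoff, not taken from IPAFinder
    max_padj: float = 0.05,  # assumed cutoff, not taken from IPAFinder
) -> Tuple[float, str]:
    """Return (IPUI_diff, change) for one IpA event."""
    diff = mean(case_ipui) - mean(ctrl_ipui)  # assumed: case minus control
    if padj < max_padj and diff >= min_diff:
        return diff, "UP"
    if padj < max_padj and diff <= -min_diff:
        return diff, "DOWN"
    return diff, "NOT"


# Hypothetical IPUI values for two control and two case samples.
print(classify_change([0.10, 0.20], [0.50, 0.60], padj=0.01))
```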

IPAFinder analysis on paired samples without replicates

Option 1: Infer differentially used IPA sites using a Fisher's exact test-based method

python IPAFinder_PS_FET.py -b1 /path/to/ctrl.bam -b2 /path/to/case.bam -anno /path/to/IPAFinder_anno_hg38.txt -p 10 -o /path/to/IPAFinder_DUIPA.txt
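For readers unfamiliar with the test, the sketch below computes a two-sided Fisher's exact P value for a 2x2 table using only the standard library (math.comb requires Python 3.8 or later). How IPAFinder_PS_FET.py builds its table is not described here, so the layout used in the example (reads supporting the IpA isoform versus all other reads of the gene, in control and case) and the read counts are assumptions for illustration only.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)  # tolerance for float ties
    )
    return min(1.0, total)


# Hypothetical counts: rows are control and case,
# columns are IpA-supporting reads and all other reads.
print(fisher_exact_two_sided(12, 88, 35, 65))
```

With real sequencing depths the binomial coefficients become very large, so a production analysis would more likely rely on an established routine such as scipy.stats.fisher_exact.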

Option 2: Infer differentially used IPA sites using a bootstrapping-based method

python IPAFinder_PS_FDR.py -b1 /path/to/ctrl.bam -b2 /path/to/case.bam -anno /path/to/IPAFinder_anno_hg38.txt -p 10 -o /path/to/IPAFinder_DUIPA.txt
