aggregationde's Introduction

This repository contains the scripts and software for reproducing the results and figures of the paper "Gene-level differential analysis at transcript-level resolution" by Lynn Yi, Harold Pimentel, Nicolas L Bray and Lior Pachter. The code can also be used to apply the aggregation methods described in the paper to new datasets. The software in the repository was written by Lynn Yi.

R/Snakefile is an example pipeline for downloading FASTQ files, performing pseudoalignment-based quantification, and bootstrapping on transcript compatibility counts (TCCs). The remaining steps, running sleuth and aggregating p-values, are performed in the R scripts tcc_pipeline.R and transcript_pipeline.R.

R/aggregation.R contains the logic for performing aggregation, including mapping TCCs to genes. R/tcc2bootstrap.R contains the logic for performing bootstraps on TCCs and writing the h5 files that sleuth takes as input.
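To illustrate what aggregating transcript-level p-values into a gene-level p-value means, here is a minimal Python sketch. It is not the code in R/aggregation.R. It uses Fisher's method, an unweighted way of combining independent p-values, as a simplified stand-in; the weighting of transcripts by mean expression that is discussed in the issue further down is not implemented. The function names `fisher_combine` and `aggregate_by_gene` and the toy transcript and gene IDs are hypothetical.

```python
import math
from collections import defaultdict


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method (unweighted)."""
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    # Clamp away from zero so that log() stays finite.
    clipped = [min(max(p, 1e-300), 1.0) for p in p_values]
    # Under the null, -2 * sum(log p) is chi-square with 2k degrees of freedom.
    half_stat = -sum(math.log(p) for p in clipped)  # test statistic divided by 2
    # Closed-form chi-square survival function for even degrees of freedom:
    # P(X > x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half_stat / j
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def aggregate_by_gene(transcript_pvalues, transcript_to_gene):
    """Group transcript p-values by gene and combine each group."""
    grouped = defaultdict(list)
    for transcript, p in transcript_pvalues.items():
        gene = transcript_to_gene.get(transcript)
        if gene is not None:  # skip transcripts without a gene mapping
            grouped[gene].append(p)
    return {gene: fisher_combine(ps) for gene, ps in grouped.items()}


# Toy input (made-up values, for illustration only).
pvals = {"tx1": 0.01, "tx2": 0.20, "tx3": 0.50}
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
gene_pvals = aggregate_by_gene(pvals, tx2gene)
```

A gene with a single transcript keeps that transcript's p-value unchanged. A gene with several transcripts receives one p-value that reflects all of them, so no single transcript has to be chosen to represent the gene.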

The folders SRPXXXXX contain code for reproducing the analyses of the two datasets in the paper. They include Snakefiles for read downloading and quantification, aggregation pipelines, and GO analysis. plot_transcripts.R includes code for reproducing Figures 1 and 2 in the paper. topGO.R includes code for performing GO analysis and reproducing Figure 5.

The folder simulation_pipeline contains code for reproducing the analyses of simulations described in the paper. pachterlab/sleuthpaperanalysis must be run first to create the simulations. simulation_pipeline.R then runs various differential expression and aggregation methods. averaging_fdrs.R and roc_curve.R average the FDRs and sensitivities. Finally, mamabear.R invokes mamabear to plot the results.
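As a rough illustration of the averaging step, the Python sketch below computes the false discovery rate (FDR) and sensitivity of a set of called genes against the simulated truth, then averages both across simulations. It is a generic sketch at a single significance cutoff, not a port of averaging_fdrs.R or roc_curve.R. The function names and gene IDs are hypothetical.

```python
def fdr_and_sensitivity(called, truly_de):
    """Return (FDR, sensitivity) for one simulation.

    called:   genes reported as differentially expressed
    truly_de: genes simulated as differentially expressed
    """
    called, truly_de = set(called), set(truly_de)
    true_pos = len(called & truly_de)
    false_pos = len(called - truly_de)
    # FDR is the fraction of calls that are false; defined as 0 with no calls.
    fdr = false_pos / len(called) if called else 0.0
    # Sensitivity is the fraction of true positives that were recovered.
    sensitivity = true_pos / len(truly_de) if truly_de else 0.0
    return fdr, sensitivity


def average_over_simulations(results):
    """Average FDR and sensitivity over a list of (called, truly_de) pairs."""
    if not results:
        raise ValueError("need at least one simulation")
    pairs = [fdr_and_sensitivity(c, t) for c, t in results]
    n = len(pairs)
    return sum(f for f, _ in pairs) / n, sum(s for _, s in pairs) / n


# Two toy simulations (made-up gene sets, for illustration only).
sims = [
    (["g1", "g2"], ["g1", "g3"]),
    (["g1"], ["g1"]),
]
mean_fdr, mean_sensitivity = average_over_simulations(sims)
```

Repeating this over a range of cutoffs would give the averaged points from which a sensitivity versus FDR curve can be drawn.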

Contributors

lakigigar, lynnyi

aggregationde's Issues

Regarding input files for simulation pipeline.

I was trying to run your pipeline but you have said that sleuthpaperanalysis must be utilized first to create simulations. Can you help in producing the simulations from sleuth paper analysis, as we are having issues running the pipeline to produce them?

The R code and (by proxy) the Snakefiles reference a variety of input directories and files which do not exist (such as finn_samples.txt), and of which we have no samples that we could imitate. Analyzing the code to determine the required format for these files appears to be quite an undertaking. Since you've managed to run these steps and have the experience we're lacking, could you share some insight on the exact nature of these files?

Some questions about the paper

I have read your paper, but there is something I don't quite understand. The p-value of a gene is estimated from the p-values of the transcript-level differential analysis, with basemean from the transcript-level results used as the weight. If one gene ID corresponds to multiple transcript IDs, how should one choose among their basemean values and p-values? In addition, is there any difference between directly converting transcript IDs into gene IDs for GO enrichment analysis and running gene enrichment analysis based on the estimated gene-level p-values?
