Giter Club home page Giter Club logo

tdfdr's Introduction

tdfdr

Two-dimensional false discovery rate control for powerful confounder adjustment in omics association studies

Overview

The package implements the two-dimensional false discovery rate control for powerful confounder adjustment in omics association analysis. The method is based on the idea that the confounder(s) usually affect part of the omics features, and thus adjusting the confounder(s) for ALL omics features will be over-adjustment, leading to reduced statistical power. The proposed procedure starts with performing the unadjusted analysis (first dimension - filtering) to narrow down the list of omics features that are more likely to be affected by either the confounder or the variable of interest or both. In the second dimension, we conduct confounder-adjusted analysis on these 'top' candidates, which are enriched in signals, to reduce multiple testing burden and increase the power. The method belongs to the general topic of using auxiliary data to increase the power of multiple testing, which has recently received tremendous research interest. In our case, the auxiliary data are the the unadjusted statistics, which could inform the probability of the null hypotheses being true. The difficulty here is to take into account the correlation between the auxiliary data (unadjusted statistics) and the main data (adjusted statistics). We provide a procedure that is theoretically guaranteed to control the false discovery rate while maximizing the power.

License

GPL version 2 or any later version

Installation

Install dependent packages "ggplot2", "reshape2", "doMC", "pbivnorm", "nspmix", "limma", "qvalue"

install.packages(c("ggplot2", "reshape2", "doMC", "pbivnorm", "nspmix"))
if (!requireNamespace("BiocManager", quietly = TRUE))
   install.packages("BiocManager")
BiocManager::install(c("limma", "qvalue"))
install.packages("devtools")
devtools::install_github("jchen1981/TDFDR/R_package")

An Example

We illustrate the usage of tdfdr package using simulated data.

 require(tdfdr)
 require(qvalue)
 source('https://raw.githubusercontent.com/jchen1981/TDFDR/master/Simulation/Code/SimFunc.R')

 
 # Simulate data
 data <- simulate.data(n = 100, p = 1000, conf.sig.cor = 1.25, dimZ = 2,
	    sig.density = 0.1, sig.strength.m = 0.4, sig.strength.sd = 0.2,
	    conf.density = 0.1, conf.strength.m = 0.4, conf.strength.sd = 0.2)
 x <- data$x
 z <- data$z
 y <- data$y
 truth <- data$truth
		
 # One-dimenisonal procedure - classic adjusted procedure
 obj1 <- summary(lm(y ~ x + z))
 pv1 <- sapply(obj1, function(x) x$coefficient[2, 4])
 qv1 <- qvalue(pv1)$qvalue
 pos1 <- qv1 <= 0.05

 # Two-dimensional procedure
 obj2 <- tdfdr(y, x, z, alpha = 0.05)
 pos2 <- obj2$pos

 # Compare the false discovery proportions
 sum(pos1 & !truth) / max(sum(pos1), 1)
 sum(pos2 & !truth) / max(sum(pos2), 1)
 
 # Compare the number of hits
 sum(pos1 & truth)
 sum(pos2 & truth)
  

Reproducibility

All the codes to reproduce the results in the manuscript are contained in 'Simulation' and 'RealData' folders. Due to file limit on github, we provide a dropbox link to download the relevant data.

tdfdr's People

Stargazers

Young Shoo avatar Qin Lin avatar

Watchers

Chen_LAB avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.