stephenslab / dsc-log-fold-change
dsc to compare approaches to estimating/testing log-fold-change from counts
Home Page: https://stephenslab.github.io/dsc-log-fold-change/
Subject: pipe_null and pipe_power
Goal: simplify the logic so that these two pipelines call the same score modules
Current version:
define:
  data: data_poisthin_null, data_poisthin_power
  method: t_test, wilcoxon
  score: type_one_error, pval_adj, fdr, auc
run:
  pipe_null: data_poisthin_null * method * type_one_error
  pipe_power: data_poisthin_power * method * pval_adj * (fdr * auc)
This is the first idea I tried but I got errors...
define:
  data: data_poisthin_null, data_poisthin_power
  method: t_test, wilcoxon
  score: type_one_error, pval_adj * (fdr, auc)
run:
  pipe_null: data_poisthin_null * method * score
  pipe_power: data_poisthin_power * method * score
I got this error: Error in smooth.spline(lambda, pi0, df = smooth.df)
This issue has been discussed on the qvalue package GitHub site in multiple threads: StoreyLab/qvalue#9 and StoreyLab/qvalue#13.
qvalue returns this error when the p-value distribution is truncated, i.e., does not span the entire range [0,1]. The authors of the qvalue package offer an alternative function, qvalue_truncp, to estimate q-values in this situation.
I opened this issue to remind myself to write an error-handling function for truncated p-value distributions.
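One possible shape for such an error-handling function is sketched below. It assumes the fallback should kick in on any error from the primary estimator. The helper name safe_qvalue and the two stand-in estimators are hypothetical. They are passed in as arguments so the sketch runs without qvalue installed. In the pipeline they would presumably be qvalue::qvalue and qvalue::qvalue_truncp.

```r
# Hypothetical helper: try a primary q-value estimator and fall back to a
# second one if the primary throws an error (e.g., from smooth.spline).
safe_qvalue <- function(p, primary, fallback) {
  tryCatch(
    list(result = primary(p), method = "primary"),
    error = function(e) {
      message("primary estimator failed: ", conditionMessage(e))
      list(result = fallback(p), method = "fallback")
    }
  )
}

# Stand-in estimators for demonstration only (NOT the qvalue package):
# the primary simulates the failure, the fallback uses base R's BH adjustment.
failing_primary <- function(p) stop("smooth.spline failure (simulated)")
bh_fallback <- function(p) p.adjust(p, method = "BH")

p_trunc <- c(0.001, 0.01, 0.02, 0.03, 0.2)  # truncated: no values near 1
res <- safe_qvalue(p_trunc, failing_primary, bh_fallback)
res$method
```

Returning the method alongside the result lets a downstream score module record which estimator was actually used.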
Need a better way of loading R packages and their dependencies. Right now I load them inside the R scripts.
Perhaps we can make a conda script to keep track of all these packages. What do you think? @jdblischak
We would like the simulation function to vary the following parameters. Next to each parameter is a link to relevant existing code:
Problem: currently the module input is the total number of samples, and the poisthin function then splits this into two groups of equal sample size.
Idea: add to the poisthin function a new parameter for the ratio of sample sizes.
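A minimal sketch of the idea, assuming the new parameter is expressed as the proportion of samples in group 1 rather than a ratio. The helper assign_groups and its argument group_prop are hypothetical and not part of poisthin.

```r
# Hypothetical helper, not part of poisthin: assign `nsamp` samples to two
# groups, where `group_prop` is the proportion of samples in group 1.
assign_groups <- function(nsamp, group_prop = 0.5) {
  stopifnot(nsamp >= 2, group_prop > 0, group_prop < 1)
  n1 <- round(nsamp * group_prop)
  n1 <- min(max(n1, 1), nsamp - 1)  # keep at least one sample per group
  # Build the 1/0 indicator, then randomly permute it across samples
  sample(rep(c(1, 0), times = c(n1, nsamp - n1)))
}

set.seed(1)
group_ind <- assign_groups(nsamp = 90, group_prop = 1/3)
table(group_ind)
```

With the default group_prop = 0.5 this reduces to the current equal-size split.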
In dsc-log-fold-change/dsc/benchmark.dsc, I'd like data_poisthin to have a logical argument shuffle_sample. This dsc module calls the poisthin function in the module folder. The syntax below currently gives me only one file. I expect two files: one for shuffle_sample=TRUE and one for shuffle_sample=FALSE. How do I do this? Thanks!
data_poisthin: R(counts = readRDS(dataFile)) + \
               dataSimulate.R + \
               R(set.seed(seed=seed); out = poisthin(mat=t(counts), nsamp=nsamp, ngene=ngene, gselect=gselect, shuffle_sample=shuffle_sample, signal_dist=signal_dist, prop_null = prop_null)) + \
               R(groupInd = out$X[,2]; Y1 = t(out$Y[groupInd==1,]); Y2 = t(out$Y[groupInd==0,]))
  dataFile: "data/pbmc_counts.rds"
  seed: R{2:101}
  nsamp: 90
  ngene: 1000
  prop_null: .5, .9, 1
  shuffle_sample: T, F
  gselect: "random"
  signal_dist: "bignormal"
  $Y1: Y1
  $Y2: Y2
  $beta: out$beta
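As a rough check on how many outputs to expect from this module, the parameter grid can be enumerated in plain R. This assumes that every listed value is crossed with every other, which is an assumption about DSC's behaviour that this thread does not confirm.

```r
# Enumerate the multi-valued parameters of data_poisthin, ASSUMING a full
# cross of all listed values (an assumption, not confirmed in this thread).
params <- expand.grid(
  seed = 2:101,
  prop_null = c(0.5, 0.9, 1),
  shuffle_sample = c(TRUE, FALSE)
)
nrow(params)                  # total number of parameter combinations
table(params$shuffle_sample)  # combinations per shuffle_sample value
```

Under that assumption there are 100 x 3 x 2 = 600 combinations, split evenly between shuffle_sample=TRUE and shuffle_sample=FALSE.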
@jhsiao999 I've got 2 comments for the DESeq2 module:
1. You might want to add @CONF: R_libs=DESeq2, or @CONF: R_libs = DESeq2, voom, ... if more libraries are used. That way the library not only gets installed automatically when others run it, but also gets loaded. Otherwise your code as is would not work, because it has no statements to load the libraries.
2. Maybe consider putting the code under the code folder instead? See my get_data.R example.
Now you can make the changes, add your module to benchmark.dsc, and give it a go using --target get_data * run_DESeq2 :)