
ashr's People

Contributors

aksarkar, daichaoxing, dcgerard, esterpantaleo, eweine, heejungshim, lsun, mengyin, nanxstats, pcarbo, stephens999, willwerscheid, zouyuxin

ashr's Issues

VB initialization

The VB (non-default) initialization appears not to work - it throws an error when I try to use pi.init.
It is OK when I use pi.init=NULL (which sets pi.init to be all 1s), but it is not clear why other values don't work.

posterior median

It would be nice to implement something to compute posterior medians.
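
A minimal sketch of one way to do this, assuming a hypothetical `post_cdf()` that returns the posterior CDF for a single observation (a plain normal stands in for the real mixture posterior here): invert the CDF at 0.5 with `uniroot`.

```r
# Hypothetical posterior CDF for one observation; a normal is a stand-in
# for the real mixture posterior.
post_cdf <- function(x) pnorm(x, mean = 0.3, sd = 0.5)

# The posterior median is where the posterior CDF crosses 0.5.
post_median <- uniroot(function(x) post_cdf(x) - 0.5,
                       interval = c(-10, 10))$root
```

For a unimodal mixture posterior the CDF is monotone, so the root is unique and `uniroot` is safe given a bracketing interval.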

CRAN to-do items

  1. Create tag for v2.0.5 on cran branch.

  2. Merge cran branch into master.

Should we export `autoselect.mixsd`?

Currently, for mr-ash via varbvs, I have to use ashr:::autoselect.mixsd (three colons) because the function autoselect.mixsd is not exported. It is not a big deal to me, but if there is no obvious harm to ashr, perhaps we should export it to avoid the ugly ::: syntax in other applications.

call returned incorrect

If you call ash.workhorse via ash (as usual), then the call returned is the call to ash.workhorse, not the call to ash.

Poisson likelihood with different priors

@mengyin I got errors with these two cases:

Case 1:
set.seed(100)
l = rexp(1)
x = rpois(100, lambda = l)
ashr::ash(rep(0,100), 1, lik = ashr::lik_pois(x), mode = 0, prior = "+uniform")

Case 2:
set.seed(100)
l = rexp(1)
x = rpois(100, lambda = l)
ashr::ash(rep(0,100), 1, lik = ashr::lik_pois(x), mode = "estimate", prior = "halfuniform")

Both returned the error message: "Error in match.arg(prior) : 'arg' should be one of “nullbiased”, “uniform”, “unit”". And when prior = "uniform", the code runs okay with no errors. Thanks so much!

diagnostic plot

It would be good to produce a diagnostic plot.
For each element betahat_j, compute where it falls (i.e., its quantile) in its predictive distribution
p(betahat_j | ghat, s_j).

If the model fits well, these quantiles should be uniform, so do a QQ plot against uniform quantiles.

This may help diagnose dependence among tests.
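
A sketch of the proposed diagnostic under a simplifying assumption (not ashr's actual fitted ghat): take ghat = N(0, tau^2), so the predictive for betahat_j is N(0, tau^2 + s_j^2), and QQ-plot the predictive quantiles against uniform quantiles.

```r
set.seed(1)
tau <- 1
s <- runif(100, 0.5, 2)                      # standard errors s_j
betahat <- rnorm(100, 0, sqrt(tau^2 + s^2))  # well-specified data
u <- pnorm(betahat, 0, sqrt(tau^2 + s^2))    # quantile of betahat_j in p(betahat_j | ghat, s_j)
qqplot(qunif(ppoints(length(u))), u,
       xlab = "uniform quantiles", ylab = "predictive quantiles")
abline(0, 1)                                 # good fit: points lie near this line
```

Systematic departures from the diagonal would flag model misfit; clumping of the quantiles could indicate dependence among tests.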

speed up the mode estimation procedure?

The mode estimation procedure might be too slow, especially when we need to search for the optimal mode over a big range. E.g., in Joyce's Poisson count (single-cell data) application, the counts range over (0,900), so it takes a pretty long time to estimate the mode.

We guess one possible way to fix the issue is to start the search from a better starting point. However, the current ashn() uses the stats::optimize function, which doesn't allow the user to provide a starting point...
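
One hedged workaround, given that stats::optimize only takes an interval rather than a starting point: shrink the interval around a cheap initial estimate. A toy sketch with a stand-in objective (the real code would use the ash log-likelihood as a function of the mode):

```r
set.seed(1)
x <- rpois(500, lambda = 40)
obj <- function(m) sum((x - m)^2)  # stand-in objective, minimised at mean(x)
start <- median(x)                 # cheap initial estimate of the mode
width <- 2 * mad(x)                # search only a window around it
fit <- optimize(obj, interval = c(start - width, start + width))
```

This avoids scanning the full (0, 900) range; the window width would need to be chosen generously enough that the true mode is not excluded.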

rename likelihood functions

We should rename all the likelihoods to lik_xxx instead of xxx_lik. This would allow a user to easily see all the available likelihoods by typing lik_ and tab-completing.

document new output options in general branch

Also, the format is rather unfriendly right now - we need to work on this.
For example:
ash(rnorm(100),1,outputlevel=list(resfns = c(ashr:::output_pm)))$res

maybe we need an exported function that creates the output resfns?

ash does not return the right set of returns?

Dear Dr. Stephens,

I tried the ashr package with great interest. I did not install the REBayes package, yet only got a warning.

But when I follow the example for the ash function, I only get the following; the values of interest are not returned. Is there anything that I missed here? Is REBayes a requirement for the correct returns? Thank you very much!

 beta = c(rep(0,100),rnorm(100))
 sebetahat = abs(rnorm(200,0,1))
 betahat = rnorm(200,beta,sebetahat)
 beta.ash = ash(betahat, sebetahat)

Due to absence of package REBayes, switching to EM algorithm
Warning message:
In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, :
there is no package called ‘REBayes’

$pi
[1] 5.120168e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
[6] 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
[11] 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 2.627738e-07
[16] 3.354781e-02 4.983565e-06 1.229349e-12 1.423635e-16 0.000000e+00
[21] 1.336484e-12 1.235669e-09 3.348131e-01 1.196170e-01 3.706764e-15
[26] 3.399726e-17 0.000000e+00 0.000000e+00 3.554808e-08

$a
[1] 0.000000000 -0.001030045 -0.001456703 -0.002060090 -0.002913407
[6] -0.004120179 -0.005826813 -0.008240358 -0.011653626 -0.016480717
[11] -0.023307253 -0.032961433 -0.046614506 -0.065922866 -0.093229012
[16] -0.131845733 -0.186458023 -0.263691465 -0.372916046 -0.527382930
[21] -0.745832093 -1.054765861 -1.491664186 -2.109531722 -2.983328371
[26] -4.219063444 -5.966656743 -8.438126887 -11.933313485

$b
[1] 0.000000000 0.001030045 0.001456703 0.002060090 0.002913407
[6] 0.004120179 0.005826813 0.008240358 0.011653626 0.016480717
[11] 0.023307253 0.032961433 0.046614506 0.065922866 0.093229012
[16] 0.131845733 0.186458023 0.263691465 0.372916046 0.527382930
[21] 0.745832093 1.054765861 1.491664186 2.109531722 2.983328371
[26] 4.219063444 5.966656743 8.438126887 11.933313485

attr(,"row.names")
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
[26] 26 27 28 29
attr(,"class")
[1] "unimix"

check that prior is not < 1

Prior values < 1 can give an infinite penalty, so they should be avoided. We should probably enforce this for the user.
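
A sketch of such a guard, assuming (as in ashr's penalized likelihood) a Dirichlet-style penalty term sum((prior - 1) * log(pi)), which is unbounded as any pi_k -> 0 when the corresponding prior_k < 1:

```r
check_prior <- function(prior) {
  # Each entry below 1 multiplies log(pi_k) by a negative weight, so the
  # penalized objective diverges as pi_k -> 0; reject such input early.
  if (any(prior < 1))
    stop("all prior values must be >= 1 (values < 1 give an infinite penalty)")
  invisible(prior)
}
```

`check_prior` is a hypothetical helper name; the real check would live wherever ash validates its prior argument.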

Version issue with ashr_2.1-25

Hi there,

I was looking to use vashr, but am running into a version problem with ashr. I think it arose when comppostprob was changed to comp_postprob, which takes different parameters. I tried to troubleshoot a couple of things in vash.R, but could you let me know, for:

 postpi.se[completeobs,] = t(comppostprob(mix.fit$g,rep(numeric(0),n),sehat[completeobs],df))

what the equivalent call to comp_postprob would be? Further details are provided in the original issue:

mengyin/vashr#2

Sorry for the cross-post - it looks like this repo is maintained more actively.

+uniform option: posteriormean issue

betahat = rnorm(100,0,0.5)
sebeta = 0.6
test.ash = ash(betahat = betahat,sebetahat = sebeta, mixcompdist = "+uniform")
test.ash$fitted.g
plot(test.ash$PosteriorMean)

fitted.g provides the right result we want, but the posterior mean has tiny nonzero negative values.

PosteriorMean can be infinity?

When I tried ash with "df=..." option on this example (data here: https://www.dropbox.com/s/d2n9lxzxtir95nh/eg.Rdata?dl=0), the "PosteriorMean" in ash object included a few infinities. The corresponding betahat's were indeed quite large (compared to the other betahat's, yet the absolute values were no more than 2), but I guess it's still weird that their posterior means are infinities.

load("eg.Rdata")
test=ash(betahat,sebetahat,df=df)
sum(test$PosteriorMean==Inf)
[1] 3

If we don't use the "df=..." option, then the posterior means will be fine. I guess there might be some numerical issues when computing the mean of truncated t distribution?

bug with modify.sd0

@MuzheZeng try this
set.seed(100)
z = rnorm(1000,0,2)
ashr::ash(z, 1, "uniform", gridmult = 1.1)

I get

Error in temp[sdzero] = ifelse(A[sdzero] <= M[sdzero] & B[sdzero] >= M[sdzero], :
NAs are not allowed in subscripted assignments

installing package ashr

Hello,
I tried installing the package as shown in the README:
library(devtools)
install_github("stephens999/ashr")
But it fails to install.
Can you please help me?
Thank you
Ronen

optmethod returned incorrectly if switch is done due to negative values

Very occasionally mixIP returns negative estimates for pi.
Then we switch to optimization by EM.
But in that case the returned ash object's $optmethod is still mixIP.

I've made it so that in future the fit object will also return optmethod, and that will be correct.
Do we also need optmethod in the ash object? (The fit object is only returned if
outputlevel = 4, so not by default.)

Skipifnotinstalled

Should use testthat::skip_if_not_installed() in tests instead of checking with require().

warnings when compiling

MixSquarem.cpp:55:12: warning: unused variable 'kr' [-Wunused-variable]
double kr=control["kr"];
^
MixSquarem.cpp:66:20: warning: unused variable 'leval' [-Wunused-variable]
int iter,feval,leval;
^
MixSquarem.cpp:212:9: warning: unused variable 'n' [-Wunused-variable]
int n=matrix_lik.nrow(), k=matrix_lik.ncol(),niter;

general ash

@mengyin can you take a look at the code I've got in my general branch?
I think it incorporates much, but not yet all, of what you had in your branch, but without all the repetition.
It seems to run OK.
There are still some things to do, but I thought it would be useful for you to look and
see if, for example, the results for logF errors look similar to yours.
(Note that for now I think df1 and df2 have to be the same for all observations; changing this
should not be too hard, but it's not done yet.)

To see how to run it, see my test in tests/testthat/test_logF.R
set.seed(10)
e = rnorm(100)+log(rf(100,df1=10,df2=10))
e.ash = ash(e,1,lik=logF_lik(df1=10,df2=10))
s = sd(log(rf(10000,10,10))) # get approximate standard deviation by simulation
e.ash.norm = ash(e,s)
expect_equal(e.ash.norm$PosteriorMean,e.ash$PosteriorMean,tol=0.05)

and also the examples under ?ash.workhorse

speeding output computation for big problems

The current computation of output is nice and modular, but also much slower
than it needs to be. Possible speed-ups would involve rearranging so that:

  i) we only compute the posterior component probabilities W once (an n by K matrix);

  ii) then we compute the posterior quantities we want, P (e.g. lfsr etc.), for each component (each an n by K matrix);

  iii) then we compile them via rowSums(W*P).

Also, in the case of excluded observations, it would be best to exclude all of them upfront:

  • first remove the data that are to be excluded (if any);

  • after iii), add back in the values for the excluded observations.
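
The three steps above can be sketched with toy matrices (W and P here are random stand-ins for the real posterior quantities):

```r
set.seed(1)
n <- 5; K <- 3
W <- matrix(runif(n * K), n, K)
W <- W / rowSums(W)               # i) posterior component probabilities, computed once
P <- matrix(rnorm(n * K), n, K)   # ii) a per-component posterior quantity (e.g. means)
post <- rowSums(W * P)            # iii) compile: one value per observation
```

Each additional output then costs only one n-by-K elementwise product and a row sum, instead of recomputing W.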

Warnings when using ash(..., df=..., ...)?

There are such warnings when using the "df=..." option of function ash().

test=ash(betahat=rnorm(100,0,1),sebetahat=rep(1,100),df=50)
Warning messages:
1: In if (a == b) { :
the condition has length > 1 and only the first element will be used
2: In if (a == b) { :
the condition has length > 1 and only the first element will be used
3: In if (a == b) { :
the condition has length > 1 and only the first element will be used

Problems with small sebetahat

Description

I seem to be getting new errors when the standard deviations are small. However, they aren't really all that small.

Example Data and Error

library(ashr)
betahat   <- c(-2.2, -1.2, 1.3, -1.8, 1.5, 0.9, 1.6, -2, 0.6, 1.1)
sebetahat <- c(0.65, 0.95, 0.32, 0.41, 0.1, 0.17, 0.59, 0.49, 0.51, 0.47)
aout      <- ashr::ash.workhorse(betahat = betahat, sebetahat = sebetahat)
## Warning in min(f): no non-missing arguments to min; returning Inf

## Error in t(A) %*% (f * d): non-conformable arguments

Error goes away for much larger sd

aout <- ashr::ash.workhorse(betahat = betahat, sebetahat = sebetahat + 1)

Generally, problem seems to be small SD

set.seed(4235)
aout <- ashr::ash.workhorse(betahat = rnorm(5), sebetahat = rep(0.5, 5))
## Warning in min(f): no non-missing arguments to min; returning Inf

## Error in t(A) %*% (f * d): non-conformable arguments
sessionInfo()
## R version 3.3.2 (2016-10-31)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 14.04.5 LTS
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] ashr_2.0.4
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.8       knitr_1.15        magrittr_1.5     
##  [4] REBayes_0.73      MASS_7.3-45       doParallel_1.0.10
##  [7] pscl_1.4.9        SQUAREM_2016.8-2  lattice_0.20-34  
## [10] foreach_1.4.3     stringr_1.1.0     tools_3.3.2      
## [13] parallel_3.3.2    grid_3.3.2        htmltools_0.3.5  
## [16] iterators_1.0.8   yaml_2.1.14       assertthat_0.1   
## [19] digest_0.6.10     tibble_1.2        Matrix_1.2-7.1   
## [22] codetools_0.2-15  evaluate_0.10     rmarkdown_1.1    
## [25] stringi_1.1.2     Rmosek_7.1.3      truncnorm_1.0-7

CIs in general branch

This option is broken by the changes in the general branch. Again, it needs fixing before merging into master. We need to add some basic tests for this option too.

Mosek documentation updates

  • Fix links to the Mac-specific and Linux-specific instructions for installing mosek and rmosek.
  • Post warning about expiry of mosek license (it is only valid for 1 year).

Interior point method sometimes fails when given small variances and uniform mixcompdist.

Example dataset:

betahat <- c(0.7775, 0.6091, 0.6880, 0.6897, 0.6565, 0.7505,
             0.7125, 0.7201, 0.7498, 0.7553)
sebetahat <- rep(0.01, length = length(betahat))
ip_out <- ashr::ash(betahat = betahat, sebetahat = sebetahat,
                    mixcompdist = "uniform", optmethod = "mixIP")
ip_out$PosteriorMean
min(ip_out$fitted.g$pi)

EM works fine:

em_out <- ashr::ash(betahat = betahat, sebetahat = sebetahat,
                    mixcompdist = "uniform", optmethod = "mixEM")
em_out$PosteriorMean
summary(em_out$fitted.g$pi)

KWDual outputs a negative mixture weight

Here is the example that produces a negative mixture weight with optmethod = mixIP only:

library(ashr)
betahat <- c(-5.8336751,1.0585307,0.8701328,0.9644079,-1.0796877,-7.7252394,-2.0433250,42.5216280,3.4243654)
sebeta  <- c(0.5003549,0.4223690,0.3462631,0.3316413,0.4076427,0.5276066,0.4302293,0.2051576,0.4270015)
out1    <- ash(betahat,sebeta,method = "fdr",mixcompdist = "normal",optmethod = "mixEM")
out2    <- ash(betahat,sebeta,method = "fdr",mixcompdist = "normal",optmethod = "mixIP")

The ad hoc solution right now is to set all negative mixture weights to 0 (or a small value near 0), and re-normalize the mixture weights. It would be good to come up with a better solution at some point.
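
The ad hoc fix described above can be sketched as follows (fix_pi is a hypothetical helper name):

```r
fix_pi <- function(pi) {
  pi[pi < 0] <- 0   # clamp negative mixture weights to zero
  pi / sum(pi)      # re-normalize so the weights again sum to 1
}
```

For example, fix_pi(c(0.7, -1e-8, 0.3)) returns a proper probability vector with the tiny negative weight zeroed out.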

Thanks to Wei for identifying this issue.

Dealing with infinity values in standard errors

It seems that in certain (maybe very rare) applications some SEs are infinite, which would result in a breakdown of the SQUAREM algorithm. A simple solution is to just ignore these cases in ash, as with NA values.
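
The proposed handling can be sketched as follows: is.finite() is FALSE for both NA and Inf, so the same mask drops both kinds of observation before fitting.

```r
betahat   <- c(1.2, -0.5, 3.0, 0.8)
sebetahat <- c(0.4, Inf, 0.6, NA)
keep <- is.finite(sebetahat)     # FALSE for Inf and NA alike
betahat_used   <- betahat[keep]
sebetahat_used <- sebetahat[keep]
```

The excluded observations would then be handled the same way ash already handles NAs (e.g. added back with default posterior summaries afterwards).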

Request to Export Functions

I use the following functions in my vicar package:

calc_lfdr, calc_lfsr, calc_np, calc_pm, calc_pp, calc_psd, calc_qvalue, calc_svalue, and estimate_mixprop.

Since these functions aren't exported in ashr (the general branch), my CRAN checks throw me a note. Could you export these functions? It's not a big deal if you prefer not to, though, as I can just copy and paste the functions for my package.

Lots of warnings from `my_e2trunct`.

I am getting a lot of warnings when I use the t-likelihood in ash with small df. I think the warnings come from my_e2trunct. It's already noted that it is unstable for df of 1 or 2, but in the dataset below it seems to be unstable for df of 6.

Example dataset:

options(warn = 2)
betahat <- c(-0.695, -0.497, -0.176, 0.115, -3.757, -0.752, 0.163, -0.308, 0.361)
sebetahat <- c(0.186, 0.221, 0.233, 0.436, 0.548, 0.307, 0.532, 0.997, 0.526)
ashr::ash.workhorse(betahat = betahat, sebetahat = sebetahat, df = 6) ## warnings
ashr::ash.workhorse(betahat = betahat, sebetahat = sebetahat, df = 7) ## works fine

issue with new wrapper ashn

In ashn, why is the default
alpha = seq(from = -abs(mean(betahat)), to = 2 * abs(mean(betahat)))
when it seems like it should maybe be
alpha = seq(from = -2 * abs(mean(betahat)), to = 2 * abs(mean(betahat)))?

Also, in the help,
alpha and then consider the model betahat_j ~ g()
should probably be betahat_j - mu ~ g().

And finally, it would be more readable if we replaced alpha with mu in the help and the function.

IP optimization error from ashr "general" branch

Without the REBayes package, ash runs successfully (see ash/analysis/adaptive_shrinkage.Rmd):

R> beta.spiky.ash = ash(sim.spiky$input$betahat,s)                             
Due to absence of package REBayes, switching to EM algorithm

However, when I install the REBayes package, I get the following error:

R> install.packages("REBayes")
R> beta.spiky.ash = ash(sim.spiky$input$betahat,s)
Error in t(A) %*% (f * d) : non-conformable arguments

The error appears to be cropping up in a call to the KWDual function in the REBayes package. Has anyone gotten this error before? Any ideas why this may be occurring?

Note I get the same error in the "master" branch version. Also note that this works fine:
beta.spiky.ash = ash(sim.spiky$input$betahat,s,optmethod = "cxxMixSquarem")

numerical issue in comp_cdf_post.unimix in general branch

eg try:

cdf_post(m=unimix(1,-100,100),0,data=set_data(-2000,1,normal_lik(),alpha=1))
[1] 0.5
cdf_post(m=unimix(1,-100,100),0,data=set_data(-20,1,normal_lik(),alpha=1))
[1] 1

they should both be approximately 1.

The problem is exponentiating too early...
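
The usual remedy (log-sum-exp) keeps everything on the log scale until the last step; a minimal sketch:

```r
# Exponentiating early underflows: exp(-2000) is 0 in double precision.
lx <- c(-2000, -2001)
naive <- log(sum(exp(lx)))   # -Inf

# Subtract the max before exponentiating, then add it back.
logsumexp <- function(lx) {
  m <- max(lx)
  m + log(sum(exp(lx - m)))
}
stable <- logsumexp(lx)      # about -1999.69
```

Applying the same idea inside comp_cdf_post.unimix (normalizing by the largest log-density before any exp()) should make both calls above return approximately 1.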
