
deconvolutiontests.jl's People

Contributors

lfcampos, maximerischard


deconvolutiontests.jl's Issues

Deconvolution Test vs KS test

Let's look at a situation where we have a standard test (in this case the K-S test) against which we can compare our deconvolution test decon.test(). This simulation is built to give assurance that, in a situation where we know what to do, the deconvolution test does as well as (if not better than) the standard test.

Assuming homoskedastic errors, the KS test is valid: under the null, the distributions of the observed Y are the same in both groups. For the DGM

Y_i1 ~ N(mu1, sigma1^2) + N(0, sigma^2)
Y_i2 ~ N(mu2, sigma2^2) + N(0, sigma^2)

  • ks({Y_i1}, {Y_i2}) should have a rejection rate of 5% (at the 0.05 level) when mu1 = mu2 and sigma1 = sigma2.
  • decon.test({Y_i1}, {Y_i2}) should have the same rejection rate.

But as we move into the alternative, i.e. mu1<mu2, keeping sigma1=sigma2, both tests will begin to reject H0 more often, eventually leading to 100% power when the separation between mu1, mu2 is large enough (relative to sigma1=sigma2 AND sigma).

It would be good to see a power profile over increasing separation, for varying levels of signal noise sigma1 = sigma2 as well as measurement-error noise sigma.

This can easily get out of hand: even running MLEs (as in my simple example) takes ~20 minutes to get smooth power profiles (2000 replicates). So this may have to wait until Issue 8 is resolved.
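A minimal sketch of such a power profile in Julia, using HypothesisTests.jl's two-sample KS test as a stand-in for ks(). The test is passed in as a function so that the deconvolution test could be plugged into the same driver; none of the parameter values below are settled choices.

```julia
# Sketch of a power profile for the homoskedastic-error DGM above.
# The test is passed in as a function returning a p-value, so the same
# driver works for the KS test and (hypothetically) for decon.test.
using Distributions, HypothesisTests, Random

ks_pvalue(y1, y2) = pvalue(ApproximateTwoSampleKSTest(y1, y2))

function power_at(test_pvalue, delta; mu1=0.0, sigma=1.0, sigma_err=0.5,
                  n=100, reps=2000, alpha=0.05, rng=Random.default_rng())
    rejections = 0
    for _ in 1:reps
        # observed data = true signal + homoskedastic measurement error
        y1 = rand(rng, Normal(mu1, sigma), n) .+ rand(rng, Normal(0, sigma_err), n)
        y2 = rand(rng, Normal(mu1 + delta, sigma), n) .+ rand(rng, Normal(0, sigma_err), n)
        rejections += test_pvalue(y1, y2) < alpha
    end
    return rejections / reps
end

# Power profile over increasing separation delta = mu2 - mu1;
# delta = 0 recovers the size of the test (should be close to 5%).
deltas = 0.0:0.1:1.0
ks_power = [power_at(ks_pvalue, d) for d in deltas]
```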

Investigate a simple alternative

Another addition to the "what's the harm?" section. One item is running the test directly on the noisy data; the other is this.

Vinay and Paul Green seem convinced that one can simply deconvolve the data and conduct tests on the resulting distributions (ignoring the resampling of measurement errors). This is wrong because we won't have the correct null distribution to test against. But what's the harm?

This needs to be investigated to see what the harm is. We can run a simulation where we simply deconvolve (with a few different deconvolution options, even) and apply the KS test directly. We could also push this to extremes where the resulting statistics clearly won't follow the KS distribution, though I can't immediately think of situations where that happens.
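A minimal sketch of that null simulation in Julia. Here deconvolve is a hypothetical stand-in for whichever deconvolution routine we plug in (Fourier, Efron, etc.), and the quantity of interest is the empirical rejection rate of the naive procedure.

```julia
# Naive procedure under the null: deconvolve each sample, then KS-test the
# deconvolved draws as if they were raw data. If the procedure were valid,
# the returned rejection rate would be close to alpha; the "harm" is any excess.
using HypothesisTests

function naive_rejection_rate(deconvolve; n=100, sigma_err=0.5, reps=2000, alpha=0.05)
    rejections = 0
    for _ in 1:reps
        # both groups share the same true distribution F_0 = N(0, 1)
        y1 = randn(n) .+ sigma_err .* randn(n)
        y2 = randn(n) .+ sigma_err .* randn(n)
        x1 = deconvolve(y1, sigma_err)   # hypothetical deconvolution step
        x2 = deconvolve(y2, sigma_err)
        rejections += pvalue(ApproximateTwoSampleKSTest(x1, x2)) < alpha
    end
    return rejections / reps
end
```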

Comparison with KS test

When the errors are homoscedastic, the KS test is valid. Under that circumstance, how does the deconvolution+bootstrap+KS test compare to the traditional KS test? Is there a loss of power?

It would be straightforward to perform some simulations of this, but perhaps more interesting to dig into theory a little bit and try to understand the difference.

Simulation design

We need to think about what simulations to perform.
Some parameters include (a rough sketch of such a grid follows the list):

  • the distribution of sigma_x and sigma_y
  • the choice of F_X and F_Y
    • equal under the null
    • or different by location, scale, or other aspects of the distribution
  • n_X, n_Y
  • the test statistic
  • the deconvolution strategy:
    • Fourier
    • Efron
    • none (naive)
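
A rough sketch of how that factorial grid could be laid out in Julia; every name and level here is an illustrative placeholder, not a settled design choice.

```julia
# Illustrative factorial grid over the design parameters listed above.
# Names and levels are placeholders to be refined, not final choices.
scenarios = [
    (; sigma_dist, fx_fy, n_x, n_y, stat, decon)
    for sigma_dist in (:constant, :heteroskedastic)        # distribution of sigma_x, sigma_y
    for fx_fy in (:equal, :location_shift, :scale_shift)   # relationship between F_X and F_Y
    for (n_x, n_y) in ((50, 50), (100, 100), (100, 500))   # sample sizes
    for stat in (:ks,)                                      # test statistic
    for decon in (:fourier, :efron, :none)                  # deconvolution strategy
]
length(scenarios)  # 2 * 3 * 3 * 1 * 3 = 54 scenarios
```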

Think About Theory

What theoretical questions do we wish to ask and (hopefully) answer? What questions will people ask?

Some I can think of:

  • under what assumptions is the test valid or approximately valid?
    • for example do we need to make smoothness assumptions on the underlying distribution?
  • how does power change with increasing number of observations?
  • Can we characterize the asymptotic null distribution in some way?
    • do we care? maybe yes so we can say something about asymptotic power...

Incorporate the deconvolution error

The first step of our algorithm is the deconvolution of all the data (obtaining an estimate of F_0). My intuition was that the uncertainty in this deconvolution is not crucial. But is this correct? Can we make this intuition more precise?

It would be possible to incorporate the uncertainty in the deconvolution by first bootstrapping the original data (nonparametric bootstrap). What do we gain by doing so? Is it more valid? more robust? more powerful?

Are there other ways to handle this uncertainty?
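One possible way, sketched in Julia below: wrap the existing procedure in an outer nonparametric bootstrap, re-running the deconvolution on each resample. decon_test is a hypothetical stand-in for the package's test (returning a p-value), not an existing function, and this is one option rather than the package's current algorithm.

```julia
# Outer nonparametric bootstrap of the original data, so that the F_0
# estimate (and hence the deconvolution) is recomputed on every resample.
using Random, StatsBase

function pvalues_with_decon_uncertainty(decon_test, y1, y2; B=200,
                                         rng=Random.default_rng())
    pvals = Vector{Float64}(undef, B)
    for b in 1:B
        y1b = sample(rng, y1, length(y1); replace=true)  # resample group 1
        y2b = sample(rng, y2, length(y2); replace=true)  # resample group 2
        pvals[b] = decon_test(y1b, y2b)  # deconvolution re-estimated inside
    end
    return pvals  # the spread of these reflects the deconvolution uncertainty
end
```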
