Giter Club home page Giter Club logo

Comments (25)

dmenne avatar dmenne commented on June 4, 2024 1

Ok, that's the example you promised above, it should do.

from projpred.

AlejandroCatalina avatar AlejandroCatalina commented on June 4, 2024

I'll check the difference between varsel and cv_varsel. Thanks for the feedback.

from projpred.

dmenne avatar dmenne commented on June 4, 2024

Any progress here? A re-submitted paper is on hold.

from projpred.

AlejandroCatalina avatar AlejandroCatalina commented on June 4, 2024

Could you share your fit object so I play with this? I checked the code and on paper the same code runs for both objects. There is one difference regarding the baseline used for the plot that I believe it's causing this difference but I would like to check with an actual object. I tested a toy classification problem locally and I got the same plot for both cv_varsel and varsel so I could not reproduce this bug.

from projpred.

dmenne avatar dmenne commented on June 4, 2024

The data are proprietary, so I had to do some obfuscation and will send these to you via email. I have been able to reproduce the plot problem with a simpler example using stan_glm which anyway is the right model to use

library(bayesplot)
library(projpred)
library(rstan)
options(mc.cores = 2)

dtx = readRDS("dtx.rds") # These data will be sent by private mail

# This is the model I should have used, because there are not repeats per subject
# However, it fails to honour the structure, see
# https://discourse.mc-stan.org/t/projpred-interactions-without-marginal-terms/14893/11
fit_glm = stan_glm(r ~ (pre + bmi + age)*group,
                        family = binomial('logit'),
                        data = dtx, chains = 2,
                        iter = 10000, seed = 1123)
vs = varsel(fit_glm, verbose = FALSE)
vs_cv = cv_varsel(fit_glm, verbose = FALSE)

# First assumption is that
plot(vs, stats=c("auc", "acc"))
plot(vs_cv, stats=c("auc", "acc"))
# The acc plots look ok. One could argue that only the point at 0 and should
# be 0.5, but to me it looks like the other point are also incorrectly scaled.

####### I would prefer when the above would work correctly by honouring
####### the formula structure, and the auc would be correct.

# The following are just trial-and-error attemtps with many
# singular cases.

# Because model structure is not taken into account with stan_glm,
# I used a dummy (1|id) and stan_glmer.
# In glmer/lme4, this would fail with good reasons.

if (FALSE){
  fit_r = stan_glmer(r ~ (pre + bmi +age)*group + (1|id),
                          family = binomial('logit'),
                          data = dtx, chains = 2,
                          iter = 10000, seed = 1123)
  vs = varsel(fit_r, verbose = FALSE)
  # This gives warnings, but the result looks right and takes
  # into
  summary(vs)
  plot(vs, stats="auc")

  vs_cv = cv_varsel(fit_r, verbose = FALSE)
  plot(vs_cv, stats="auc")
}

Correct for varsel

Rplot01

Incorrect scaling of auc for cv_varsel (probably ok for acc)

Rplot

from projpred.

AlejandroCatalina avatar AlejandroCatalina commented on June 4, 2024

from projpred.

dmenne avatar dmenne commented on June 4, 2024

I believe that the pure prior value a 0 should not be fixed, but also be simulated and have error bars.

from projpred.

AlejandroCatalina avatar AlejandroCatalina commented on June 4, 2024

I've had a bit of time to inspect this.

I checked that indeed the AUC plot is different in cv_varsel compared to varsel. Per my investigation, this is expected when you think about it because varsel predicts a constant probability for all data points, which results in the desired 0.5 AUC. However, when using LOOCV for projpred, the prediction of each data point i is made with a model where the ith observation is removed, and therefore the intercept is slightly different. This has a big impact in the AUC computation because we need to “plot” the ROC curve and compute the area under it by ordering the predictions according to decreasing probability. Since the intercept is slightly larger for models where we have removed a data point of class 1, this ordering results in all data points of class 0 appearing first in the ordered predictions vector. The ROC for this data is just 0 until the FPR gets to 1, and then it shoots up to 1. It just happens that the AUC for this ROC is 0. This makes sense to me, but it’s also true that you can simply reverse the sign of the predictions (or say the probability is equal to 1 - current prediction) to have the opposite behaviour, now all data points of class 1 appear first as true positives, and therefore the AUC under this model would be 1 (which does not mean it’s good as the accuracy is bad anyway because the prediction is almost constant).

This is a well known limitation of CV in extreme cases corresponding to non-smooth utility functions (AUC for instance). A reference on this issue is Celisse & Arlot (2010) [5.2.1].

This came up earlier during projpred development and the development team decided against doing anything in particular to address this, so I'm closing the issue and leaving this answer as explanation. We will probably document these cases better.

from projpred.

dmenne avatar dmenne commented on June 4, 2024

This should be very carefully documented, because it will cause confusion with terms commonly used in binary decision, and a "reviewer hardened" reference without too much math must be available. AUC of 0 is considered perfect information in the community.

However, would you agree with the following:

  • Point at 0 is degenerate. So why not leave it out totally, and document it? Showing a degenerate case does not sound reasonable to me.
  • Are the other points comparable? To explain to end-users with clinical background only, is varsel in-sample and cv_varsel out-of-sample?
  • Is there any chance that https://discourse.mc-stan.org/t/projpred-interactions-without-marginal-terms/14893/11 could be resolved, i.e. varsel(stan_glm) follow the behavior of stan_glmer?

As an aside: it would be nice if plot.varsel were a real ggplot object. I expected that on could scale axis etc with ggplot tools, but have not been successful.

from projpred.

AlejandroCatalina avatar AlejandroCatalina commented on June 4, 2024
  1. Point 0 is not wrong, is just has a different interpretation when you do LOO-CV. This behaviour is different if you use KFold-CV or stratified Leave-Two-Out. We have not implemente LTO-CV but you can play with KFold.
  2. You will get AUC = 1 if you reverse the predictions, of course, but that does not indicate perfect predictions anyway, because the same LOO-CV effect applies to the reverse.
  3. Every submodel is correctly computed, just be careful with the intercept-only model (point 0). As you correctly point out, varsel evaluates in-sample whereas cv_varsel shows out-of-sample performance.
  4. You can pass a custom search_terms list to varsel to enforce that interactions are only added after evaluating the main effects.
  5. As far as I know, plot returns a ggplot2 object?

from projpred.

dmenne avatar dmenne commented on June 4, 2024
  • K-fold works - I know that the prediction is bad, but that's the important part I want to document. I am a bit surprised that 0 variables are almost as good as all variables (0.4 being equivalent to 0.6 in decision terms). Larger error bars would have made this palatable.

Rplot02

  • K-fold gives many warnings which look more like a programming problem than some convergence warnings:
1: In cbind(y, mu, weights) :
  number of rows of result is not a multiple of vector length (arg 1)
2: In cbind(y, mu.bs, weights) :
  number of rows of result is not a multiple of vector length (arg 1)
3: In cbind(y, mu, weights) :
  number of rows of result is not a multiple of vector length (arg 3)

About ggplot2: I will check more carefully and make a separate post of it.

from projpred.

AlejandroCatalina avatar AlejandroCatalina commented on June 4, 2024

from projpred.

dmenne avatar dmenne commented on June 4, 2024

Develop branch give only one presumable uncritical warning:
Quick-TRANSfer stage steps exceeded maximum (= 500000). Some explanation what this means would be nice.

Stangely, the result looks much better than k-fold in the CRAN/master version:

Rplot03

from projpred.

dmenne avatar dmenne commented on June 4, 2024

Sorry for the comment about ggplot2.
Typical error: I had used %>% instead of + to add ylim(0,1). I know that hadley has regretted that inconsistency many times, but such is history.

from projpred.

AlejandroCatalina avatar AlejandroCatalina commented on June 4, 2024

from projpred.

dmenne avatar dmenne commented on June 4, 2024

Looks like this is a warning in classical stats/kmeans
https://stackoverflow.com/questions/21382681/kmeans-quick-transfer-stage-steps-exceeded-maximum

from projpred.

AlejandroCatalina avatar AlejandroCatalina commented on June 4, 2024

from projpred.

dmenne avatar dmenne commented on June 4, 2024

Maybe add this to the docs (please check if it works for you, it could also be some problem with different random generatords):

vs_cv = cv_varsel(fit_glm, verbose = FALSE, cv_method="kfold",   algorithm="Lloyd")

from projpred.

dmenne avatar dmenne commented on June 4, 2024

The default setting produce quite variable results on multiple runs. I will have to check how to make results more stable.

Looking ahead to searchterms`.

Or maybe make stan_glm behave like stan_glmer? That would be far more consistent.

from projpred.

AlejandroCatalina avatar AlejandroCatalina commented on June 4, 2024

We can make projpred internally adjust the search space for specific terms like interactions in this case, but glms are very fundamentally different to glmms so I'm not exactly sure what you mean by make glm behave like glmer?

from projpred.

avehtari avatar avehtari commented on June 4, 2024

Stangely, the result looks much better than k-fold in the CRAN/master version:

Thanks for reporting this. It's good to know that the new defaults have visible effect also for others than just for me :)

This should be very carefully documented, because it will cause confusion with terms commonly used in binary decision

As Alejandro mentioned we had seen this case, but as it was artificially created case and no-one had reported seeing this, and there was no easy fix, we did nothing. In all these years, you are the first one to see this outside of simulated examples. Now re-thinking this, we could add warnings when ACC or AUC are clearly below random chance and mention the inherent unstability of those utilities.

from projpred.

dmenne avatar dmenne commented on June 4, 2024

you mean by make glm behave like glmer

What Bill Venables hammered into our brains when we all were younger: using interactions before main effects is justified only in few cases, and doing this in a some forward/backward way jumps between incompatible models.
stepAIC at that time did and still does it right, whatever you think about AIC as a criterion.

That's not different between glm and glmer (or their stan-versions), even if the algorithms are different. My surrogate (1|id) when there is not real id-repeat is painful to me.

from projpred.

AlejandroCatalina avatar AlejandroCatalina commented on June 4, 2024

from projpred.

dmenne avatar dmenne commented on June 4, 2024

Any progress in glm honoring model structure as glmer does? Reviewers are waiting for a re-submission.

from projpred.

AlejandroCatalina avatar AlejandroCatalina commented on June 4, 2024

Not on our side. You can achieve this behaviour by manually setting the search_terms argument as follows.

For every interaction x:y you include in the model, you add the following terms to search_terms <- c(..., "x", "y", "x + y + x:y", ...), so that x:y is only considered after including both x and y. Note that search_tems has to explicitly include the intercept 1, so a full example would be search_terms <- c("1", "x", "y", "x + y + x:y").

We are working on several projects at the moment improving projpred in several directions and I'm afraid we won't have time to implement this as a whole feature in the short term.

from projpred.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.