Comments (25)
Ok, that's the example you promised above, it should do.
from projpred.
I'll check the difference between varsel
and cv_varsel
. Thanks for the feedback.
from projpred.
Any progress here? A re-submitted paper is on hold.
from projpred.
Could you share your fit object so I play with this? I checked the code and on paper the same code runs for both objects. There is one difference regarding the baseline used for the plot that I believe it's causing this difference but I would like to check with an actual object. I tested a toy classification problem locally and I got the same plot for both cv_varsel and varsel so I could not reproduce this bug.
from projpred.
The data are proprietary, so I had to do some obfuscation and will send these to you via email. I have been able to reproduce the plot problem with a simpler example using stan_glm
which anyway is the right model to use
library(bayesplot)
library(projpred)
library(rstan)
options(mc.cores = 2)
dtx = readRDS("dtx.rds") # These data will be sent by private mail
# This is the model I should have used, because there are not repeats per subject
# However, it fails to honour the structure, see
# https://discourse.mc-stan.org/t/projpred-interactions-without-marginal-terms/14893/11
fit_glm = stan_glm(r ~ (pre + bmi + age)*group,
family = binomial('logit'),
data = dtx, chains = 2,
iter = 10000, seed = 1123)
vs = varsel(fit_glm, verbose = FALSE)
vs_cv = cv_varsel(fit_glm, verbose = FALSE)
# First assumption is that
plot(vs, stats=c("auc", "acc"))
plot(vs_cv, stats=c("auc", "acc"))
# The acc plots look ok. One could argue that only the point at 0 and should
# be 0.5, but to me it looks like the other point are also incorrectly scaled.
####### I would prefer when the above would work correctly by honouring
####### the formula structure, and the auc would be correct.
# The following are just trial-and-error attemtps with many
# singular cases.
# Because model structure is not taken into account with stan_glm,
# I used a dummy (1|id) and stan_glmer.
# In glmer/lme4, this would fail with good reasons.
if (FALSE){
fit_r = stan_glmer(r ~ (pre + bmi +age)*group + (1|id),
family = binomial('logit'),
data = dtx, chains = 2,
iter = 10000, seed = 1123)
vs = varsel(fit_r, verbose = FALSE)
# This gives warnings, but the result looks right and takes
# into
summary(vs)
plot(vs, stats="auc")
vs_cv = cv_varsel(fit_r, verbose = FALSE)
plot(vs_cv, stats="auc")
}
Correct for varsel
Incorrect scaling of auc for cv_varsel
(probably ok for acc)
from projpred.
from projpred.
I believe that the pure prior value a 0 should not be fixed, but also be simulated and have error bars.
from projpred.
I've had a bit of time to inspect this.
I checked that indeed the AUC plot is different in cv_varsel compared to varsel. Per my investigation, this is expected when you think about it because varsel predicts a constant probability for all data points, which results in the desired 0.5 AUC. However, when using LOOCV for projpred, the prediction of each data point i is made with a model where the ith observation is removed, and therefore the intercept is slightly different. This has a big impact in the AUC computation because we need to “plot” the ROC curve and compute the area under it by ordering the predictions according to decreasing probability. Since the intercept is slightly larger for models where we have removed a data point of class 1, this ordering results in all data points of class 0 appearing first in the ordered predictions vector. The ROC for this data is just 0 until the FPR gets to 1, and then it shoots up to 1. It just happens that the AUC for this ROC is 0. This makes sense to me, but it’s also true that you can simply reverse the sign of the predictions (or say the probability is equal to 1 - current prediction) to have the opposite behaviour, now all data points of class 1 appear first as true positives, and therefore the AUC under this model would be 1 (which does not mean it’s good as the accuracy is bad anyway because the prediction is almost constant).
This is a well known limitation of CV in extreme cases corresponding to non-smooth utility functions (AUC for instance). A reference on this issue is Celisse & Arlot (2010) [5.2.1].
This came up earlier during projpred development and the development team decided against doing anything in particular to address this, so I'm closing the issue and leaving this answer as explanation. We will probably document these cases better.
from projpred.
This should be very carefully documented, because it will cause confusion with terms commonly used in binary decision, and a "reviewer hardened" reference without too much math must be available. AUC of 0 is considered perfect information in the community.
However, would you agree with the following:
- Point at 0 is degenerate. So why not leave it out totally, and document it? Showing a degenerate case does not sound reasonable to me.
- Are the other points comparable? To explain to end-users with clinical background only, is varsel in-sample and cv_varsel out-of-sample?
- Is there any chance that https://discourse.mc-stan.org/t/projpred-interactions-without-marginal-terms/14893/11 could be resolved, i.e. varsel(stan_glm) follow the behavior of stan_glmer?
As an aside: it would be nice if plot.varsel were a real ggplot object. I expected that on could scale axis etc with ggplot tools, but have not been successful.
from projpred.
- Point 0 is not wrong, is just has a different interpretation when you do LOO-CV. This behaviour is different if you use KFold-CV or stratified Leave-Two-Out. We have not implemente LTO-CV but you can play with KFold.
- You will get AUC = 1 if you reverse the predictions, of course, but that does not indicate perfect predictions anyway, because the same LOO-CV effect applies to the reverse.
- Every submodel is correctly computed, just be careful with the intercept-only model (point 0). As you correctly point out, varsel evaluates in-sample whereas cv_varsel shows out-of-sample performance.
- You can pass a custom
search_terms
list tovarsel
to enforce that interactions are only added after evaluating the main effects. - As far as I know, plot returns a ggplot2 object?
from projpred.
- K-fold works - I know that the prediction is bad, but that's the important part I want to document. I am a bit surprised that 0 variables are almost as good as all variables (0.4 being equivalent to 0.6 in decision terms). Larger error bars would have made this palatable.
- K-fold gives many warnings which look more like a programming problem than some convergence warnings:
1: In cbind(y, mu, weights) :
number of rows of result is not a multiple of vector length (arg 1)
2: In cbind(y, mu.bs, weights) :
number of rows of result is not a multiple of vector length (arg 1)
3: In cbind(y, mu, weights) :
number of rows of result is not a multiple of vector length (arg 3)
- An example using
search_terms
would be helpful, it is unclear how to use it. After reading @avehtari 's comment on https://discourse.mc-stan.org/t/projpred-interactions-without-marginal-terms/14893/11?u=denne, I thought that this was simply not implemented.
About ggplot2: I will check more carefully and make a separate post of it.
from projpred.
from projpred.
Develop branch give only one presumable uncritical warning:
Quick-TRANSfer stage steps exceeded maximum (= 500000)
. Some explanation what this means would be nice.
Stangely, the result looks much better than k-fold in the CRAN/master version:
from projpred.
Sorry for the comment about ggplot2.
Typical error: I had used %>% instead of + to add ylim(0,1). I know that hadley has regretted that inconsistency many times, but such is history.
from projpred.
from projpred.
Looks like this is a warning in classical stats/kmeans
https://stackoverflow.com/questions/21382681/kmeans-quick-transfer-stage-steps-exceeded-maximum
from projpred.
from projpred.
Maybe add this to the docs (please check if it works for you, it could also be some problem with different random generatords):
vs_cv = cv_varsel(fit_glm, verbose = FALSE, cv_method="kfold", algorithm="Lloyd")
from projpred.
The default setting produce quite variable results on multiple runs. I will have to check how to make results more stable.
Looking ahead to searchterms`.
Or maybe make stan_glm
behave like stan_glmer? That would be far more consistent.
from projpred.
We can make projpred internally adjust the search space for specific terms like interactions in this case, but glms are very fundamentally different to glmms so I'm not exactly sure what you mean by make glm behave like glmer?
from projpred.
Stangely, the result looks much better than k-fold in the CRAN/master version:
Thanks for reporting this. It's good to know that the new defaults have visible effect also for others than just for me :)
This should be very carefully documented, because it will cause confusion with terms commonly used in binary decision
As Alejandro mentioned we had seen this case, but as it was artificially created case and no-one had reported seeing this, and there was no easy fix, we did nothing. In all these years, you are the first one to see this outside of simulated examples. Now re-thinking this, we could add warnings when ACC or AUC are clearly below random chance and mention the inherent unstability of those utilities.
from projpred.
you mean by make glm behave like glmer
What Bill Venables hammered into our brains when we all were younger: using interactions before main effects is justified only in few cases, and doing this in a some forward/backward way jumps between incompatible models.
stepAIC
at that time did and still does it right, whatever you think about AIC as a criterion.
That's not different between glm and glmer (or their stan-versions), even if the algorithms are different. My surrogate (1|id) when there is not real id-repeat is painful to me.
from projpred.
from projpred.
Any progress in glm honoring model structure as glmer does? Reviewers are waiting for a re-submission.
from projpred.
Not on our side. You can achieve this behaviour by manually setting the search_terms
argument as follows.
For every interaction x:y
you include in the model, you add the following terms to search_terms <- c(..., "x", "y", "x + y + x:y", ...)
, so that x:y
is only considered after including both x
and y
. Note that search_tems
has to explicitly include the intercept 1
, so a full example would be search_terms <- c("1", "x", "y", "x + y + x:y")
.
We are working on several projects at the moment improving projpred in several directions and I'm afraid we won't have time to implement this as a whole feature in the short term.
from projpred.
Related Issues (20)
- Memory issues in multilevel models HOT 2
- Predictive performance gap / jumpy behavior at full size in Gaussian multilevel model HOT 1
- For a stan_glm model cv_varsel with loo works, but kfold gives an error HOT 4
- plot.vsel shows extra xtick label HOT 1
- brmsfit cumulative model & Latent projection predictive feature selection HOT 3
- Convergence checks for additive submodels HOT 4
- Allow dispersion parameter to be observation-specific HOT 1
- Smoothing of cross-validated predictive performance HOT 1
- Add R2 as performance statistic HOT 4
- Error when getting reference model with k-fold cross validation for cv_varsel HOT 9
- Truncated response distributions
- Clarify doc for refit_prj HOT 2
- Don't warn about slightly large khats HOT 1
- Remove QR=TRUE from the vignette
- Guide on latent_ll_oscale for other families e.g. zero-inflated negative binomial, hurdle negbinom HOT 5
- Using projpred in multi-response models with group correlated hierarchical effects HOT 3
- negbin latent projection in latent vignette fails with parallel HOT 1
- Request : Monotonic effect HOT 1
- Document global options + improve option names HOT 1
- Missing exports of S3 methods
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from projpred.