Strange results with binomial model in dev version

stan-dev commented on June 4, 2024

Strange results with binomial model in dev version

from projpred.

Comments (13)

AlejandroCatalina commented on June 4, 2024

Alright, now everything is working the same in @develop as well. The last commit addresses this issue (I checked your model manually, @andymilne). I am also thinking about allowing the user to pass a custom exploration function for the search; I'll see if there's a nice way to build as much of that function internally as possible.

AlejandroCatalina commented on June 4, 2024

The plots indicate that there must be something fishy in the projections. The output you get from cv_varsel indicates that the reference model may be ill-behaved, given that some log-likelihoods are infinite. The very large standard deviations indicate that the projections are a bit too vague. What is your reference model?

andymilne commented on June 4, 2024

The reference model is my own (not sure if that's what you are asking) -- it is described in my post, but I attach the full summary for clarity:

Model Info:
 function:     stan_glm
 family:       binomial [logit]
 formula:      cbind(tap, n_pulses - tap) ~ cue * (tap_l_1 + presentation_sc + 
	   repetition_sc + N_sc + K_sc + mean_IOI_sc + evenness_sc + 
	   all_ent_sc + step_ent_sc + SQ_sc + CQ_sc + balance_sc + is_high_prime + 
	   reg_exp_non_zero_sc + bal_wt_sc + height_sc + deja_vu2_sc + 
	   APM_sc + edge_sc)
 algorithm:    sampling
 sample:       4000 (posterior sample size)
 priors:       see help('prior_summary')
 observations: 48282
 predictors:   40

It fits relatively quickly, with good n_effs and Rhats and the bayes_R2 is 0.92. The pp_check is a bit off but that is because I am not including random effects in the ref model, and ideally the family would be betabinomial (the fit to the data when using both of those is very good).

AlejandroCatalina commented on June 4, 2024

Yeah, I was referring to the reference model's posterior.

Given that it does not include group effects, the default search method would be L1 search, so you can try varsel(fit, method="forward") and see if its output differs (forward search is usually a bit more accurate, though more computationally demanding). Another thing to check is the log-likelihood of the reference model: extract the refmodel object with ref <- get_refmodel(fit) and take a look at ref$loglik, which, judging from cv_varsel's output, should contain some Inf or NaN values. If that's the case, you can try narrower priors on your reference model (horseshoe?) and see whether that helps. In my experience with binomial models, the projections can become problematic when the model is too sure of some observations; inspecting ref$mu, the mean prediction from your reference model, usually helps to see this. You can also take a look at vsel$search_path$sub_fits, where vsel is the output from varsel or cv_varsel.

I'm sorry my answer is a bit exploratory, I'm trying to give you detailed feedback based on what I've encountered when working with these kinds of models. I hope this helps in narrowing the issue.
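
A minimal base-R sketch of the log-likelihood check suggested above. It does not call projpred: toy_ll is a toy stand-in for ref$loglik, and the helper name count_nonfinite and the matrix orientation are assumptions made for illustration. The first lines show one way a binomial log-likelihood becomes -Inf: the predicted probability rounds to exactly 1 while the observed count is below the number of trials.

```r
# If the linear predictor is large enough, the inverse logit rounds to
# exactly 1 in double precision. An observation with fewer successes than
# trials then has likelihood 0, i.e. log-likelihood -Inf.
p_sat  <- plogis(40)
ll_sat <- dbinom(3, size = 5, prob = p_sat, log = TRUE)

# Hypothetical helper: count non-finite entries in a matrix of pointwise
# log-likelihoods (assumed here: rows = observations, columns = draws).
count_nonfinite <- function(ll) {
  c(neg_inf = sum(ll == -Inf, na.rm = TRUE),
    pos_inf = sum(ll == Inf, na.rm = TRUE),
    nan     = sum(is.nan(ll)))
}

# Toy stand-in for ref$loglik: 3 observations, 2 draws.
toy_ll <- matrix(c(-0.5, -1.2, ll_sat, -0.3, -2.0, -0.7), nrow = 3)
counts  <- count_nonfinite(toy_ll)

# Rows (observations) with at least one non-finite value are the ones
# worth comparing against the corresponding predictions.
bad_obs <- which(rowSums(!is.finite(toy_ll)) > 0)
```

With a real reference model, the same helper could be applied to ref$loglik, and the flagged observations compared against ref$mu to see whether the predictions there sit at 0 or 1.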

andymilne commented on June 4, 2024

Thanks -- I will get back to you when I get a chance to try these.

jpiironen commented on June 4, 2024

If the varsel-plot is correct, it indicates that the reference model is really bad (worse than a model without any variables). So I would definitely check that first: estimate the performance of 1) the reference model and 2) the null model (no variables), both on the training data and using cross-validation (either LOO or k-fold), to understand what's going on with the reference model. This is separate from projpred.

If this check shows the reference model is actually good (at least better than the null model), then we can suspect that there's something wrong with projpred, but based on that varsel-plot one cannot really say much about that.

andymilne commented on June 4, 2024

Juho -- the ref model is definitely good (see my comments to Alejandro above) and here's the loo comparison with the null (intercept only) model:

Model comparison based on LOO-CV: 
                         elpd_diff se_diff  
mdl_pluse_tap_ref_rstan        0.0       0.0
mdl_pluse_tap_null_rstan -128755.3     891.1

I am happy to make the rstanarm reference model available if useful.

Alejandro -- method = "forward" produces essentially identical results.

There are -Infs in the ref$loglik but no NaNs. I don't understand what that means or implies. I am refitting the model with a horseshoe prior but that is going very slowly (it's been running through the night and is only at 200 iterations -- when I use student(3, 0, 1) priors, the model completes in about 15 minutes). I don't understand the meaning of ref$mu and vsel$search_path$sub_fits so I don't know what to look for there.

AlejandroCatalina commented on June 4, 2024

I believe the issue lies in the projections, due to the singular (multicollinear) variables. In a nutshell, I believe this happens because the inner method implemented in develop tries to get maximum likelihood estimates as close as possible to the true ones without adding regularization, and in cases like this regularization is what helps stabilize the estimates. To test this hypothesis, you can run the same model with master’s varsel method and see if the output is different. If the projections behave correctly there, it would point in this direction.

Regarding the reference model, it seems that the posterior would be quite vague (large uncertainty in the parameters due to the singularities), and therefore the horseshoe prior should help the projections by removing some collinearities. While the reference model can still behave properly, the projections suffer much more, so I would expect the resulting reference model to behave better (it’s still a hypothesis, though).

I’m going to play around with this, and it would be helpful to have this model if possible. I will also try to simulate highly correlated data myself so I have something to test. Thanks for the feedback!
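
The general point about collinearity and regularization can be sketched in base R. This is a linear least-squares analogue with a ridge penalty, chosen for brevity; it is not the GLM projection that projpred performs, and the simulated data, the penalty value lambda and all variable names are assumptions made for illustration.

```r
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 1e-3)   # x2 is nearly a copy of x1
X  <- cbind(x1, x2)
y  <- x1 + rnorm(n)

XtX <- crossprod(X)              # t(X) %*% X
Xty <- crossprod(X, y)           # t(X) %*% y

# Unregularized least squares: the near-singular XtX makes the two
# coefficients poorly determined individually.
beta_ols <- solve(XtX, Xty)

# Ridge estimate: adding lambda to the diagonal improves the conditioning
# and shrinks the coefficient vector.
lambda     <- 1
XtX_ridge  <- XtX + lambda * diag(ncol(X))
beta_ridge <- solve(XtX_ridge, Xty)

# Condition numbers before and after adding the penalty.
kappa_plain <- kappa(XtX, exact = TRUE)
kappa_ridge <- kappa(XtX_ridge, exact = TRUE)
```

Comparing beta_ols with beta_ridge, and kappa_plain with kappa_ridge, shows how even a small penalty changes the conditioning of the problem when two predictors are almost identical.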

andymilne commented on June 4, 2024

Yes, the results from the non-dev version of projpred look much more reasonable! I will email you a link to the model.

An unrelated question ... Given that, for this model, it is essential that any given term and its interaction must both be included, is there a way to enforce this in projpred or a principled workaround such as iteratively changing the penalty values in the call to varsel or cv_varsel?

AlejandroCatalina commented on June 4, 2024

Indeed, we have worked that out in the develop version, and the included variables should make more sense from a modeling perspective: it always includes main effects together with their interactions, group intercepts, a main effect together with its group effect, and so on. Thanks for testing my hypothesis, now I have a clear workflow :). Regards, Alejandro

(Sent by email in reply to Andrew Milne's message of 12 Jun 2020, 5:50 AM +0300.)

andymilne commented on June 4, 2024

With regard to interactions of the population terms, the most common scenario would be where including any interaction involving a given term requires all lower-order interactions of that term, and the term itself, to be included as well. E.g., if the ref model has X1*X2*X3, including X1:X2:X3 requires X1:X2, X1:X3, X2:X3, X1, X2, and X3 also to be included. This is required to ensure the predictors' coefficients are invariant to shifts in the predictors' values.
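
The rule in the paragraph above can be sketched in base R. The helper required_terms is hypothetical and is not a projpred function; it only enumerates the terms that must accompany a given interaction.

```r
# Hypothetical helper (not part of projpred): given an interaction term
# such as "X1:X2:X3", return the term itself plus every lower-order
# interaction and main effect built from the same variables.
required_terms <- function(term) {
  vars <- strsplit(term, ":", fixed = TRUE)[[1]]
  out <- character(0)
  for (k in seq_along(vars)) {
    # All combinations of k variables, joined back into "A:B" form.
    out <- c(out, combn(vars, k, FUN = paste, collapse = ":"))
  }
  out
}

terms_needed <- required_terms("X1:X2:X3")
print(terms_needed)
```

For the three-way interaction this yields the three main effects, the three two-way interactions, and the three-way term itself.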

However, in some scenarios, additional inclusion "implications" can be important for interpretive purposes. For example, the ref model discussed here takes the form X1*(X2 + X3 + X4 + ... ) and my requirement would be that if any of the terms inside the brackets is included, its interaction with X1 should also be included. E.g., if X2 is included, X1:X2 should also be included. I am not sure how such custom implications could best be specified -- perhaps the user could pass, as an argument, a list of pairs of variables, meaning that if the first variable is included by varsel/cv_varsel, the second must also be included. E.g., c("X2, X1:X2", "X3, X1:X3", "X4, X1:X4").
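
A minimal sketch of how such implication pairs might be applied, in base R. Nothing here is a projpred argument or function: apply_implications is a hypothetical helper, and the pairs are written as a list of two-element character vectors rather than the comma-separated strings proposed above, purely to avoid string parsing.

```r
# Each pair reads: if the first term is selected, the second must be too.
implications <- list(c("X2", "X1:X2"),
                     c("X3", "X1:X3"),
                     c("X4", "X1:X4"))

# Hypothetical helper: extend a set of selected terms until every
# implication is satisfied. The loop repeats so that chained implications
# (A implies B, B implies C) are also resolved.
apply_implications <- function(selected, implications) {
  repeat {
    added <- FALSE
    for (pair in implications) {
      if (pair[1] %in% selected && !(pair[2] %in% selected)) {
        selected <- c(selected, pair[2])
        added <- TRUE
      }
    }
    if (!added) break
  }
  selected
}

expanded <- apply_implications(c("X2", "X4"), implications)
print(expanded)
```

Here selecting X2 and X4 pulls in X1:X2 and X1:X4, while X3 and X1:X3 stay out because X3 was not selected.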

AlejandroCatalina commented on June 4, 2024

That’s an interesting proposal. Currently, at any given point, we expand the current projection in every possible direction that makes sense. I’m thinking such a function could be passed by the user. I will try to think of more user-friendly ways of providing the required information, so that we can do the expansion internally while asking for only minimal input from the user.

jpiironen commented on June 4, 2024

Thanks for posting those cv-results. Looks sensible.

"Yes, the results from the non-dev version of projpred look much more reasonable!"

Ok, if this problem happens only with the dev-version, then Alejandro can help more. This should be a good case for debugging since, as far as I understand, one should get the same result with both versions (as it's just a simple GLM, no hierarchy or anything).
