Comments (18)
Thanks for the suggestion. I totally agree this would be a good idea and that the current implementation is subobtimal. This is already on our todo-list, but unfortunately this will have to wait a bit since I currently have some more urgent things that have higher priority. But we should definitely do this before we start implementing variable selection for the hierarchical models (stan_glmer
for instance). In any case I think we will have init_refmodel
also in the future so that the users can perform projections onto linear models from an arbitrary reference model not in brms
or rstanarm
.
from projpred.
I totally understand. Once the S3 refactioring is done, I will start featuring projpred in brms by writing the related methods for brmsfit objects.
Maybe, if I find the time I just send you a PR making the functions S3 generic. It shouldn't be too complicated and just require some refactoring.
from projpred.
Feel free to contribute in terms of PR or anything else, we appreciate that. But even if you didn't have time to do this, I will eventually, there's just going to be some delay.
from projpred.
I have been looking at the internals of projpred and have some questions / comments.
It seems to me as if the primary task is to bring a fitted model object into the structure suggested by init_refmodel
. Once we have the refmodel
extracted, we can use it in any of the methods of projpred, such as varcel
etc. Is that correct?
Currently, the function to extract the refmodel from stanreg
objects is called .extract_vars
, which is confusing given that we have a default constructor named init_refmodel
. We should define a generic whose methods create refmodel
objects (or whatever we want to call it) either based on some fitted model objects or like init_refmodel
does it as the default method. The returned refmodel
object could then be used in the other methods.
@jpiironen (and @avehtari if you like) could we have a skype call / google hangout at some point where you explain to me the current structure of the package and I can make some suggestions?
from projpred.
from projpred.
That would work for me.
from projpred.
As an answer to Paul's question, yes, once the reference model is formed then it can be passed to any of the functions in projpred. Your suggestion definitely seems sensible, and I have been planning exactly the same thing myself. But we can discuss it in more detail later on.
from projpred.
I finally had time to do work on this. Now the program works so that each of the main functions (like varsel
) accepts any object as an input but they will be passed to an S3-generic get_refmodel
-function that initializes the refmodel
structure using init_refmodel
, and all the remaining computations are done using this object. Currently we have the get_refmodel
-function for stanreg
-objects but it should now be fairly easy to make projpred more compatible with other fit-objects, such as brmsfit
by writing a corresponding get_refmodel
function for it.
@paul-buerkner would you mind taking a quick look at the relevant parts of the code and see if it makes sense to you? The implementation can be found from the refmodel_predict-branch, and most of the relevant code is in refmodel.R
-file (you can check also for instance varsel
-function to see how those functions are invoked). I'm sure some further changes still need to be done but I think this is already a big step forward.
from projpred.
Thanks, Juho! Looks good to me so far! I will try to get a prototype of get_refmodel.brmsfit
working in the next one or two weeks.
One think that I saw in predict.refmodel
is the call to object$fam$ll_fun
. That one looks rather rstanarm-specific to me. How about saving ll_fun
directl in the refmodel
?
from projpred.
ll_fun
and other related family specific helpers are stored in the family object because some of them are needed in places which need not know anything about the reference model (see for instance glmfun.R
) and are thus kept separated on purpose. Now I'm not sure whether ll_fun
specifically would be something that could be erased from the family-object, but this would require some thought, so I think we'll keep it there for the time being.
I will think this more when I have time to tidy up the kl_helpers.R
which is a bit of a mess at the moment and makes it somewhat tricky to add new families. This might already become something that needs to be resolved in one way or another after making everything communicate with brms
as those family-objects are somewhat different from the traditional family-objects in R as far as I understood. But you will probably have some more ideas when you start implementing get_refmodel.brmsfit
.
from projpred.
I mean I can manually add the ll_fun
function to my family objects inside get_refmodel.brmsfit
. So this won't be a blocker.
I am just thinking of how to make it easy for packages in general to enable compatibility with projpred.
from projpred.
Right, now I see better what you mean. Well, currently the projpred supports only a small set of families and is not meant to be able to handle family objects outside that set. So basically you don't need to write the ll_fun
youself (projpred will do this), you simply need to pass in a standard R family-object that is currently supported (at the moment gaussian, binomial and poisson). But I also will need to think this more how to make everything as easily compatible to other packages as possible.
from projpred.
Ah I see. Sorry I missed the part of projpred preparing that automatically.
from projpred.
I am nearly finished with supporting brms in projpred, but have two questions before I submit a PR:
-
Is there a specific reason we can't just let
predfun
handle offsets itself? The reason I am asking
is because I need some special case coding in brms just to exclude offsets in the computation,
that I would love to avoid. -
It seems as if the projpred with family
binomial
requires probabilities instead of counts. Can
we also make it support counts? Again, the reason is similar as in 1) that I need to do some (actually
a little bit more) special case coding to ensure thatpredfun
will only return success probabilities.
from projpred.
We thought that it would be more convenient and less error prone if the user did not have to implement the offsets when coding predfun
for init_refmodel
. I thought that it should always be easier to make predictions without offsets but apparently you have some special case I didn't consider. Perhaps this could be changed but I would still guess that in most cases users would find the current solution easier.
The predfun
should always output expected values for the target variable (and not counts) since these are required in order to compute the projection using the current methodology. So unfortunately if I understand correctly what you mean I don't see any way of getting around this.
from projpred.
I have build a workaround to get offsets working in the way you currently require, but out of interest: Would you expect projpred to work correctly if predfun
handled offsets automatically if you pass the offset variable(s) via the data frame?
from projpred.
I think we can close this issue now. @jpiironen what do you think?
from projpred.
Yes I think so too. If some further complications arise (as is likely eventually), let's open a new issue. Closing this now.
from projpred.
Related Issues (20)
- Test Suite Crashes HOT 5
- variable selection for uncorrelated random variables HOT 3
- Incompatible with brms when random effects are specified with || syntax HOT 2
- Error from multivariate brms models HOT 6
- Memory issues in multilevel models HOT 2
- Predictive performance gap / jumpy behavior at full size in Gaussian multilevel model HOT 1
- For a stan_glm model cv_varsel with loo works, but kfold gives an error HOT 4
- plot.vsel shows extra xtick label HOT 1
- brmsfit cumulative model & Latent projection predictive feature selection HOT 3
- Convergence checks for additive submodels HOT 4
- Allow dispersion parameter to be observation-specific HOT 1
- Smoothing of cross-validated predictive performance HOT 1
- Add R2 as performance statistic HOT 4
- Error when getting reference model with k-fold cross validation for cv_varsel HOT 9
- Truncated response distributions
- Clarify doc for refit_prj HOT 2
- Don't warn about slightly large khats HOT 1
- Remove QR=TRUE from the vignette
- Guide on latent_ll_oscale for other families e.g. zero-inflated negative binomial, hurdle negbinom HOT 5
- Using projpred in multi-response models with group correlated hierarchical effects HOT 3
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from projpred.