Comments (12)
I agree with both. I think scale_penalty_with_samples = true is the better default. However, the current default lambda = 1 means rather strong regularisation when the input is (close to) standardised, which may be the most common case (see below). In this case, users usually have to change lambda to avoid underfitting. We could also set the default to lambda = 1e-8 (or some other small value), with the argument that it barely affects the unregularised solution in the non-separable case while still avoiding runaway solutions in the separable case. Users would then usually have to deal with overfitting.

So, if we have good evidence that 1) (close to) standardised input is the most common case and 2) the majority of users perceive (potential) overfitting as a more reasonable fit than (potential) underfitting, I would argue for lowering the default lambda.
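The runaway behaviour in the separable case can be seen directly: with no (or negligible) penalty, the logistic loss on linearly separable data keeps decreasing as the coefficient norm grows, so there is no finite minimiser. A minimal numpy sketch (not using MLJLinearModels; the toy data are made up for illustration):

```python
import numpy as np

# Toy 1-D linearly separable data: negatives at x < 0, positives at x > 0.
x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([0, 0, 1, 1])

def logistic_loss(b):
    """Mean logistic loss for the model p(y=1|x) = sigmoid(b * x)."""
    margins = (2 * y - 1) * (b * x)            # signed margins
    return np.logaddexp(0.0, -margins).mean()  # log(1 + exp(-margin)), numerically stable

# Scaling the coefficient up always lowers the loss: the unregularised
# solution has infinite norm, and predicted probabilities approach 1.
for b in (1.0, 10.0, 100.0):
    print(b, logistic_loss(b))
```

Any nonzero lambda, even a tiny one, restores a finite minimiser, which is the argument for eps() over 0.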
If we write the solution of logistic regression with lambda = 1e-8, then in the separable case the prediction for the correct class would be basically 1.
from mljlinearmodels.jl.
> why not completely lambda = 0

We could do this. I just don't like too much that, in the separable case, the solution would have infinite norm; lambda = eps() would avoid that.
@tlienart This is breaking, no? I think we need a breaking (minor) release, not a patch. Or am I missing something?
Thanks @tlienart. I'm making a PR to General to yank 0.6.5 from the registry.
It's a convention on the objective function; the reason is to have the scale of the loss and the penalty on the same grounds, so that if you have twice as much data, you don't have to change the regularisation. In the case of ridge, for instance, the objective is

(1/n) ||y - Xb||^2 + lambda ||b||^2,

and multiplying through by n shows this is the same as the unnormalised loss with the penalty scaled by the number of samples:

||y - Xb||^2 + n * lambda ||b||^2.
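To illustrate the convention, here is a small numpy sketch (not MLJLinearModels' code; the data are made up) showing that with the 1/n-scaled objective, duplicating the dataset leaves the ridge solution unchanged for a fixed lambda:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

def ridge(X, y, lam):
    """Closed-form minimiser of (1/n)||y - Xb||^2 + lam * ||b||^2."""
    n, p = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(p), X.T @ y / n)

b_once = ridge(X, y, 0.1)
b_twice = ridge(np.vstack([X, X]), np.concatenate([y, y]), 0.1)  # twice the data
print(np.allclose(b_once, b_twice))  # → True: no need to retune lambda
```

Without the 1/n factor, doubling the data would effectively halve the strength of a fixed lambda, which is exactly what the scaling convention avoids.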
PS: wait, I'm confused, you made those changes, didn't you? I don't think anyone has touched that logic since you did.

Ah no, it's not you, it's @jbrea; maybe he can chip in if you have further questions.

Note: in any case I think that parameter is best obtained via hyperparameter optimisation.
Thanks for the explanation @tlienart. I think users (like me) will expect that the default behavior of an algorithm, especially one as simple as logistic regression, is to work out of the box and provide a reasonable fit with the default hyperparameters. This new hyperparameter does seem to mess things up as far as I can see; the output is almost like a biased random coin toss. It would probably make more sense to default to false, wouldn't it? Moreover, this would have been a non-breaking change from 0.5.7 if I followed the history correctly.
Please have a look at #108 for the reasoning behind it, specifically the tuning.
I don't think you can expect a default that is not scaled to work well across the board for users. More generally, I don't think you can expect a good default for this, full stop. These parameters must be tuned, and the tuning should not be affected by sample size.
Thanks, I like this suggestion.
Also agree: why not completely lambda = 0, which is vanilla logistic regression?
Thanks both for the discussion; default set to eps(), patch release under way.
Ok, would you mind doing it? Thanks!
Done! 5bb7c6d#commitcomment-80390857