juliastats / glmnet.jl
Julia wrapper for fitting Lasso/ElasticNet GLM models using glmnet
License: Other
What do you think of capitalizing this as GLMNet? I know that the official MATLAB page uses Glmnet as the capitalization in one place, but it seems strange to me that GLM isn't all in caps.
Hello,
I'd like to cite this repository in a paper.
It seems like the recommended way to cite GitHub repositories is with a DOI.
It seems like this repository is pretty inactive, but I'd like to give credit where it's due.
Is there an owner who wants to register a DOI? E.g., with Zenodo?
How can I simply get coef(cvfit, s = "lambda.min"), as R does?
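A minimal sketch of one way to do this, assuming the cross-validation result exposes a `meanloss` vector and a `path` with a `betas` matrix (one column per lambda), as recent GLMNet.jl versions do; the helper name is hypothetical:

```julia
# Sketch: coefficients at the lambda with minimum cross-validated loss.
# `betas` stands in for cv.path.betas, `meanloss` for cv.meanloss.
best_lambda_coefs(betas::AbstractMatrix, meanloss::AbstractVector) =
    betas[:, argmin(meanloss)]

# Dummy data for illustration (one column per lambda):
betas = [1.0 2.0 3.0; 4.0 5.0 6.0]
meanloss = [0.5, 0.2, 0.9]
best = best_lambda_coefs(betas, meanloss)   # picks column 2
```

With a real fit this would be called as `best_lambda_coefs(cv.path.betas, cv.meanloss)`, where `cv = glmnetcv(X, y)`.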
Is there a way to use warm starts for glmnet and glmnetcv?
The R wrapper seems to have the warm start capability.
Thanks!
Is this behavior intended? After the call below, glmnetcv has modified the constraints matrix in place.
MWE:
X = rand(1000, 2)
y = X[:,1] + randn(1000)
constraints = [-1 -1; Inf Inf]
res = glmnetcv(X, y, constraints = constraints)
julia> constraints
2×2 Matrix{Float64}:
-4.69035e-7 -4.57555e-7
Inf Inf
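A sketch of a defensive workaround, assuming the mutation observed above is glmnetcv writing back into the `constraints` argument: pass a copy so the caller's matrix survives the call.

```julia
# Pass a copy of the constraints matrix so the original is not mutated.
constraints = [-1.0 -1.0; Inf Inf]
safe = copy(constraints)          # use as: glmnetcv(X, y, constraints = safe)
safe[1, 1] = -4.69035e-7          # simulate the in-place modification
# the caller's matrix is untouched:
constraints[1, 1] == -1.0
```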
The Fortran code has methods for CSR predictor matrices. Currently Julia only supports sparse matrices in CSC format in Base, right? How difficult would it be to use glmnet's sparse capabilities?
(I might take a crack at this at some point but I figured I'd open an issue to see if someone else wants to work on it too.)
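One observation that might make this easier: the CSC storage of the transpose of A is exactly the CSR storage of A, so a CSR routine could be fed from `sparse(transpose(A))`. A small sketch using only the SparseArrays standard library:

```julia
using SparseArrays

# CSC of A' holds A in CSR layout: colptr of A' = row pointers of A,
# rowval of A' = column indices of A, nzval in row-major nonzero order.
A = sparse([1.0 0.0 2.0; 0.0 3.0 0.0])
At = sparse(transpose(A))
rowptr = At.colptr      # CSR row pointers of A (1-based)
colind = At.rowval      # CSR column indices of A
vals   = At.nzval       # nonzero values of A in row-major order
```

Whether the Fortran side expects 0-based or 1-based pointers would still need checking against the glmnet sources.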
Hello, how do you use Multinomial() or other distributions for the family option? I tried
glmnet(X,Y, family::Multinomial())
and glmnet(X,Y, family=:Multinomial())
where X and Y are both 2-D arrays.
Probably an easy question, but I just couldn't figure out how to use it. Thanks for your help.
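A sketch of what may be the intended call, based on the package's `glmnet(::Matrix, ::Matrix, ::Distribution; kw...)` methods visible in other error messages here: the distribution is a positional argument, not a keyword, and Y is an n x k indicator (or count) matrix. The `onehot` helper below is hypothetical, used only to build such a matrix from labels:

```julia
# Build an n x k indicator matrix from a label vector.
function onehot(labels::AbstractVector)
    classes = unique(labels)
    Float64[l == c for l in labels, c in classes]
end

labels = ["a", "b", "c", "a"]
Y = onehot(labels)                 # 4x3 indicator matrix
# glmnet(X, Y, Multinomial())      # assumed call, with a real X
```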
When I try to set the family to Binomial, glmnet gives me an error:
ERROR: MethodError: no method matching glmnet!(::Matrix{Float64}, ::Vector{Float64}, ::Binomial{Float64}; alpha=1, standardize=false, intercept=true, lambda=[0.0005])
Closest candidates are:
glmnet!(::Matrix{Float64}, ::Matrix{Float64}, ::Binomial; offsets, weights, alpha, penalty_factor, constraints, dfmax, pmax, nlambda, lambda_min_ratio, lambda, tol, standardize, intercept, maxit, algorithm) at ~/.julia/packages/GLMNet/Bzpup/src/GLMNet.jl:337
I am using version 1.8.
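Judging from the closest-candidates list, the Binomial path only accepts y as an n x 2 matrix, not a Vector{Float64}. A sketch of expanding a 0/1 vector into that form; which column is treated as the target class is an assumption worth checking against the README:

```julia
# Expand a 0/1 response vector into a two-column count matrix.
y = [1.0, 0.0, 0.0, 1.0]
Y = hcat(1 .- y, y)       # column 1: class 0, column 2: class 1 (assumed order)
# glmnet(X, Y, Binomial(); alpha = 1, lambda = [0.0005])  # hypothetical call
```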
Hello, I stumbled over a few things trying to follow the example problem.
It looks like the DataArrays package is needed for the definition of rep in the glmnetcv function.
Also, I had a little trouble with the convert method called from within the show function:
ERROR: no method convert(Type{Array{T,2}}, CompressedPredictorMatrix)
in convert at base.jl:11
Changing the function name "convert" to something else worked, but being brand new to Julia, I don't yet know the proper way to fix this.
I am encountering an error when trying to run a binomial logistic ridge regression. The error does not occur when using the default least squares regression.
import GLMNet
cv_gauss = GLMNet.glmnetcv(X, y, GLMNet.Normal(), alpha=0.0)
cv_binom = GLMNet.glmnetcv(X, y, GLMNet.Binomial(), alpha=0.0)
The final line of code produces the following error:
no method glmnet!(Array{Any,1},Array{Float64,2},Array{Float64,1},Binomial)
at In[100]:3
in glmnet at /home/dhimmels/.julia/v0.2/GLMNet/src/GLMNet.jl:346
in glmnetcv at /home/dhimmels/.julia/v0.2/GLMNet/src/GLMNet.jl:382
I am new to julia, so perhaps I'm missing something. Thanks.
From README:
For logistic models, y is either a string vector or an m x 2 matrix
But the following doesn't work
using GLMNet
y = ["M", "B", "M", "B"]
X = rand(4, 10)
glmnet(X, y, Binomial())
MethodError: no method matching glmnet(::Matrix{Float64}, ::Vector{String}, ::Binomial{Float64})
Closest candidates are:
glmnet(::AbstractMatrix{T} where T, ::AbstractVector{T} where T, ::AbstractVector{T} where T) at /home/users/bbchu/.julia/packages/GLMNet/C8WKF/src/CoxNet.jl:151
glmnet(::AbstractMatrix{T} where T, ::AbstractVector{T} where T, ::AbstractVector{T} where T, ::CoxPH; kw...) at /home/users/bbchu/.julia/packages/GLMNet/C8WKF/src/CoxNet.jl:151
glmnet(::Matrix{Float64}, ::Vector{Float64}, ::Distribution; kw...) at /home/users/bbchu/.julia/packages/GLMNet/C8WKF/src/GLMNet.jl:485
...
Fortunately, if y is a matrix with 2 columns, it does work:
y = [1 0; 0 1; 0 1; 1 0]
X = rand(4, 10)
glmnet(X, y, Binomial())
Logistic GLMNet Solution Path (100 solutions for 10 predictors in 833 passes):
────────────────────────────────
df pct_dev λ
────────────────────────────────
[1] 0 0.0 0.476672
[2] 1 0.0582906 0.455006
[3] 1 0.11166 0.434325
[4] 1 0.160737 0.414585
[5] 1 0.206039 0.395741
[6] 1 0.248 0.377754
[7] 1 0.286986 0.360585
...
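Until the string-vector method works as documented, a small hypothetical helper can encode a two-level string vector as the indicator matrix used in the working example above (the sorted-levels convention is an assumption):

```julia
# Encode a two-level label vector as an n x 2 indicator matrix.
function binomial_matrix(y::AbstractVector{<:AbstractString})
    levels = sort(unique(y))
    @assert length(levels) == 2 "expected exactly two classes"
    Float64[yi == l for yi in y, l in levels]
end

Y = binomial_matrix(["M", "B", "M", "B"])
# glmnet(X, Y, Binomial())    # as in the working matrix example
```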
The glmnet source in this repository is outdated, dating back to 2015. The glmnet Fortran backbone has since been updated several times. Please consider updating to the latest version.
The source files can be found at https://github.com/cran/glmnet/tree/master/src
The README states constraints should be an n x 2 matrix. However, it is actually a 2 x n matrix.
I'd prefer if it were n x 2, but if it's supposed to be 2 x n, it would be good to fix the README in order to avoid confusion.
The URL of this package does not match that stored in METADATA.jl.
cc: @ararslan
How do I specify the family as in R, e.g. glmnetcv(X, y, family = "binomial")?
Line 195 of GLMNet.jl:
warn("glmnet: number of non-zero coefficients along path exceeds $nx at $(maxit+10000)th lambda value")
But the variable nx is not defined anywhere.
It appears that the build of this package, using gfortran on OS X 10.9.5, tries to use the -shared option, which that compiler does not recognize.
julia> Pkg.build("GLMNet")
INFO: Building GLMNet
i686-apple-darwin8-gfortran-4.2: unrecognized option '-shared'
Undefined symbols for architecture x86_64:
"MAIN_", referenced from:
_main in libgfortranbegin.a(fmain.o)
ld: symbol(s) not found for architecture x86_64
collect2: ld returned 1 exit status
========================================[ ERROR: GLMNet ]========================================
failed process: Process(gfortran -m64 -fdefault-real-8 -ffixed-form -fPIC -shared -O3 glmnet3.f90 -o libglmnet.so
, ProcessExited(1)) [1]
while loading /Users/psz/.julia/v0.3/GLMNet/deps/build.jl, in expression starting on line 3
The GLMNet package includes the Relaxed Lasso option, which recent research has shown performs very well.
Would it be possible for GLMNet.jl to allow this?
glmnetcv(X, y)
in quickstart gives error:
MethodError: glmnetcv(::Array{Float64,2}, ::Array{Float64,1}) is ambiguous
PackageEvaluator.jl is a script that runs nightly. It attempts to load all Julia packages and run their tests (if available) on both the stable version of Julia (0.2) and the nightly build of the unstable version (0.3). The results of this script are used to generate a package listing enhanced with testing results.
Tests pass.
Package doesn't load.
'Tests pass.' means that PackageEvaluator found the tests for your package, executed them, and they all passed.
'Package doesn't load.' means that PackageEvaluator did not find tests for your package. Additionally, trying to load your package with using failed.
This issue was filed because your testing status became worse. No additional issues will be filed if your package remains in this state, and no issue will be filed if it improves. If you'd like to opt out of these status-change messages, reply to this message saying so and @IainNZ will add an exception. If you'd like to discuss PackageEvaluator.jl, please file an issue at its repository. For example, your package may be untestable on the test machine due to a dependency; an exception can be added for that.
Test log:
INFO: Installing ArrayViews v0.4.6
INFO: Installing DataArrays v0.1.12
INFO: Installing DataFrames v0.5.6
INFO: Installing Distributions v0.5.2
INFO: Installing GLMNet v0.0.2
INFO: Installing GZip v0.2.13
INFO: Installing PDMats v0.2.0
INFO: Installing Reexport v0.0.1
INFO: Installing SortingAlgorithms v0.0.1
INFO: Installing StatsBase v0.5.3
INFO: Building GLMNet
INFO: Package database updated
Warning: could not import Sort.sortby into DataFrames
Warning: could not import Sort.sortby! into DataFrames
ERROR: repl_show not defined
in include at ./boot.jl:245
in include_from_node1 at ./loading.jl:128
in include at ./boot.jl:245
in include_from_node1 at ./loading.jl:128
in reload_path at loading.jl:152
in _require at loading.jl:67
in require at loading.jl:54
in include at ./boot.jl:245
in include_from_node1 at ./loading.jl:128
in reload_path at loading.jl:152
in _require at loading.jl:67
in require at loading.jl:51
in include at ./boot.jl:245
in include_from_node1 at loading.jl:128
in process_options at ./client.jl:285
in _start at ./client.jl:354
while loading /home/idunning/pkgtest/.julia/v0.3/DataFrames/src/dataframe/reshape.jl, in expression starting on line 163
while loading /home/idunning/pkgtest/.julia/v0.3/DataFrames/src/DataFrames.jl, in expression starting on line 110
while loading /home/idunning/pkgtest/.julia/v0.3/GLMNet/src/GLMNet.jl, in expression starting on line 2
while loading /home/idunning/pkgtest/.julia/v0.3/GLMNet/testusing.jl, in expression starting on line 1
INFO: Package database updated
Note this is possibly due to removal of deprecated functions in Julia 0.3-rc1: JuliaLang/julia#7609
Hello. I just saw that GLMNet v0.1.0 has been released according to the GLMNet.jl site.
However, when I ran Pkg.update(), my current GLMNet v0.0.5 was not updated to v0.1.0. I also tried to remove GLMNet v0.0.5 and reinstall it via Pkg.add("GLMNet"), but still v0.0.5 was installed. Why does this happen, and how can I get v0.1.0 without using Pkg.checkout("GLMNet")? Since v0.1.0 is an official release, I should be able to get that version simply by Pkg.update(), in principle, correct?
Best,
BVPs
In trying to get the cross-validation output from glmnet in R and GLMNet.jl to conform, I find the losses differ even when everything else (lambda sequence, fold id) is the same across the two. This yields an argmin(cv.meanloss) (Julia) different from which.min(cv$cvm) (R), and it sometimes matters. What is the source of the difference?
Example:
R
require(glmnet)
data <- iris
foldid <- rep(1:10, nrow(data) / 10)
x <- model.matrix(data = data, ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width)
cvl <- cv.glmnet(y = data$Species, x = x,
family = "multinomial", alignment = "fraction", foldid = foldid)
round(cvl$lambda, 8)
cvl$cvm
2.1972246 2.0531159 1.9324868 ...
julia
using Pkg
Pkg.add("RDatasets")
Pkg.add("GLMNet")
Pkg.add("GLM")
using RDatasets, GLMNet, GLM
iris = dataset("datasets", "iris")
fml = @formula(Species ~ SepalLength + SepalWidth + PetalLength + PetalWidth)
x = ModelMatrix(ModelFrame(fml, iris)).m
foldid = repeat(1:10, Int(size(iris, 1) / 10))
cvl = glmnetcv(x, iris.Species; folds = foldid)
cvl.lambda'
cvl.meanloss
2.1955639962247964
2.0530748153377423
1.9324668652650965
...
When I try to build GLMNet.jl, it tells me that the file is not found. I dug around, and it looks like the problem is inside the build.jl file:
pic = @windows ? "" : "-fPIC"
run(`gfortran -m$WORD_SIZE -fdefault-real-8 -ffixed-form $pic -shared -O3 glmnet5.f90 -o libglmnet.so`)
and when the run( ) line is read, under Windows it will change $pic to '' instead of an actual blank, which messes up the build. When I removed the $pic from that line, it built fine.
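A sketch of one possible fix, assuming the problem is that the empty placeholder string is passed to gfortran as a literal empty argument on Windows: build the flag list conditionally instead of interpolating a possibly-empty string (interpolating an array into a Cmd splats its elements).

```julia
# Assemble compiler flags without an empty-string placeholder.
flags = ["-fdefault-real-8", "-ffixed-form", "-shared", "-O3"]
iswindows = false                   # stand-in for Sys.iswindows()
iswindows || push!(flags, "-fPIC")  # only add -fPIC off Windows
# run(`gfortran -m$(Sys.WORD_SIZE) $flags glmnet5.f90 -o libglmnet.so`)
```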
This issue is being filed by a script, but if you reply, I will see it.
PackageEvaluator.jl is a script that runs nightly. It attempts to load all Julia packages and run their tests (if available) on both the stable version of Julia (0.2) and the nightly build of the unstable version (0.3).
The results of this script are used to generate a package listing enhanced with testing results.
The status of this package, GLMNet, on...
'No tests, but package loads.' can be due to there being no tests (you should write some if you can!) but can also be due to PackageEvaluator not being able to find your tests. Consider adding a test/runtests.jl
file.
'Package doesn't load.' is the worst-case scenario. Sometimes this arises because your package doesn't have BinDeps support, or needs something that can't be installed with BinDeps. If this is the case for your package, please file an issue and an exception can be made so your package will not be tested.
This automatically filed issue is a one-off message. Starting soon, issues will only be filed when the testing status of your package changes in a negative direction (gets worse). If you'd like to opt-out of these status-change messages, reply to this message.
Maybe I'm blind, but how can I fit a linear lasso model on multiple dependent variables?
There doesn't seem to be a method for
X = rand(100, 10)
Y = rand(100, 4)
path = glmnet(X, Y, Normal())
# or
path = glmnet(X, Y, MvNormal())
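Assuming no multi-response (mgaussian-style) method exists in the wrapper, one workaround is to fit one Gaussian path per response column. The sketch below takes the fitting function as an argument (`fitfun` would be `(X, y) -> glmnet(X, y, Normal())` in real use; a dummy is used here for illustration):

```julia
# Fit each response column independently; returns one result per column.
fit_columns(fitfun, X, Y) = [fitfun(X, Y[:, j]) for j in 1:size(Y, 2)]

X = rand(100, 10)
Y = rand(100, 4)
paths = fit_columns((X, y) -> sum(y), X, Y)   # dummy fitfun for illustration
length(paths)                                  # one result per response
```

Note this loses the shared-sparsity grouping that R's mgaussian family provides, so it is only an approximation of the multi-response model.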
It could be nice to return the loglik for each lambda, this would make it easier to choose the set of betas based on BIC or AIC.
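In the meantime, for a Gaussian fit the residual deviance can plausibly be recovered as (1 - dev_ratio) * null_dev (i.e. the RSS), which gives information criteria up to an additive constant; the field names and this reconstruction are assumptions about the path object:

```julia
# AIC/BIC (up to a constant) from per-lambda deviance ratios, Gaussian case.
function ic_from_path(dev_ratio, null_dev, df, n)
    rss = (1 .- dev_ratio) .* null_dev
    aic = n .* log.(rss ./ n) .+ 2 .* df
    bic = n .* log.(rss ./ n) .+ log(n) .* df
    aic, bic
end

aic, bic = ic_from_path([0.0, 0.5], 200.0, [0, 1], 100)
```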
From the basic regression model in the quick start at https://github.com/JuliaStats/GLMNet.jl: if y is of type Array{Int64,1}, glmnet will give an ambiguity error:
ERROR: MethodError: glmnet(::Array{Float64,2}, ::Array{Int64,1}) is ambiguous. Candidates:
glmnet(X::Array{Float64,2}, y; kw...) in GLMNet at C:\Users\ly.julia\packages\GLMNet\1uQom\src\Multinomial.jl:191
glmnet(X::AbstractArray{T,2} where T, y::AbstractArray{var"#s77",1} where var"#s77"<:Number) in GLMNet at C:\Users\ly.julia\packages\GLMNet\1uQom\src\GLMNet.jl:492
Possible fix, define
glmnet(::Array{Float64,2}, ::AbstractArray{var"#s77",1} where var"#s77"<:Number)
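A sketch of a workaround, assuming the ambiguity is resolved once y matches the concrete Vector{Float64} method shown in other signatures here: convert the integer response before the call.

```julia
# Convert an integer response vector to Float64 before calling glmnet.
y_int = [1, 2, 3]
y = Float64.(y_int)
# glmnet(X, y)    # with a real X::Matrix{Float64}; assumed to dispatch cleanly
eltype(y)
```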
This issue is used to trigger TagBot; feel free to unsubscribe.
If you haven't already, you should update your TagBot.yml
to include issue comment triggers.
Please see this post on Discourse for instructions and more details.
If you'd like for me to do this for you, comment TagBot fix
on this issue.
I'll open a PR within a few hours, please be patient!
The null deviances seem to be incorrect when I run GLMNet.glmnet; I'm getting absurdly small numbers.
Example: this code
# Simulated data set
X_sim = randn((100,10));
beta_sim = randn(10);
y_sim = randn(100) .+ (X_sim * beta_sim .+ 3.14)
sim_path = GLMNet.glmnet(X_sim, y_sim)
Produces the output:
Least Squares GLMNet Solution Path (64 solutions for 10 predictors in 266 passes):
───────────────────────────────
df pct_dev λ
───────────────────────────────
[1] 0 0.0 3.15936
[2] 1 0.0716368 2.87869
[3] 1 0.131111 2.62295
[4] 1 0.180487 2.38994
[5] 1 0.221481 2.17762
[6] 2 0.270351 1.98417
...
Which seems fine -- but then when I run
sim_path.null_dev
I get an absurdly small number:
6.240013019814641e-34
In contrast, when I compute the null deviance (sum of squares) myself:
size(X_sim, 1) * var(y_sim)
I get
2389.5611952108716
Have I misunderstood something? It's easy enough to compute the null deviance on my own, but it seems like GLMNet.jl isn't computing it as advertised.
And I don't see it covered in your unit tests. So maybe this was a small blind spot.
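For reference, the quantity the reporter expected in null_dev, the Gaussian null deviance, is just the total sum of squares around the mean, computable with the Statistics standard library:

```julia
using Statistics

# Gaussian null deviance: total sum of squares around the mean.
null_deviance(y) = sum(abs2, y .- mean(y))

y = [1.0, 2.0, 3.0, 4.0]
null_deviance(y)    # 5.0
```

(The reporter's size(X_sim, 1) * var(y_sim) is this quantity scaled by n/(n-1), since var uses the n-1 denominator, but either is of order 10^3 here, nowhere near 1e-34.)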