juliastats / glmnet.jl
Julia wrapper for fitting Lasso/ElasticNet GLM models using glmnet
License: Other
What do you think of capitalizing this as GLMNet? I know that the official MATLAB page uses Glmnet as the capitalization in one place, but it seems strange to me that GLM isn't all in caps.
Hello,
I'd like to cite this repository in a paper.
It seems like the recommended way to cite GitHub repositories is with a DOI.
It seems like this repository is pretty inactive, but I'd like to give credit where it's due.
Is there an owner who wants to register a DOI? E.g., with Zenodo?
How can I simply get coef(cvfit, s = "lambda.min"), as R does?
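A minimal sketch of one way to do this, assuming the cross-validation result exposes a `meanloss` vector and a `path` with a `betas` matrix (one column per lambda), as recent GLMNet.jl versions do; the helper name is hypothetical:

```julia
# Sketch: coefficients at the lambda with minimum cross-validated loss.
# `betas` stands in for cv.path.betas, `meanloss` for cv.meanloss.
best_lambda_coefs(betas::AbstractMatrix, meanloss::AbstractVector) =
    betas[:, argmin(meanloss)]

# Dummy data for illustration (one column per lambda):
betas = [1.0 2.0 3.0; 4.0 5.0 6.0]
meanloss = [0.5, 0.2, 0.9]
best = best_lambda_coefs(betas, meanloss)   # picks column 2
```

With a real fit this would be called as `best_lambda_coefs(cv.path.betas, cv.meanloss)`, where `cv = glmnetcv(X, y)`.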
Is there a way to use warm starts for glmnet and glmnetcv?
The R wrapper seems to have the warm start capability.
Thanks!
Is this behavior intended? After the call below, glmnetcv has modified the constraints matrix in place.
MWE:
X = rand(1000, 2)
y = X[:,1] + randn(1000)
constraints = [-1 -1; Inf Inf]
res = glmnetcv(X, y, constraints = constraints)
julia> constraints
2×2 Matrix{Float64}:
-4.69035e-7 -4.57555e-7
Inf Inf
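A sketch of a defensive workaround, assuming the mutation observed above is glmnetcv writing back into the `constraints` argument: pass a copy so the caller's matrix survives the call.

```julia
# Pass a copy of the constraints matrix so the original is not mutated.
constraints = [-1.0 -1.0; Inf Inf]
safe = copy(constraints)          # use as: glmnetcv(X, y, constraints = safe)
safe[1, 1] = -4.69035e-7          # simulate the in-place modification
# the caller's matrix is untouched:
constraints[1, 1] == -1.0
```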
The Fortran code has methods for CSR predictor matrices. Currently Julia only supports sparse matrices in CSC format in Base, right? How difficult would it be to use glmnet's sparse capabilities?
(I might take a crack at this at some point but I figured I'd open an issue to see if someone else wants to work on it too.)
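One observation that might make this easier: the CSC storage of the transpose of A is exactly the CSR storage of A, so a CSR routine could be fed from `sparse(transpose(A))`. A small sketch using only the SparseArrays standard library:

```julia
using SparseArrays

# CSC of A' holds A in CSR layout: colptr of A' = row pointers of A,
# rowval of A' = column indices of A, nzval in row-major nonzero order.
A = sparse([1.0 0.0 2.0; 0.0 3.0 0.0])
At = sparse(transpose(A))
rowptr = At.colptr      # CSR row pointers of A (1-based)
colind = At.rowval      # CSR column indices of A
vals   = At.nzval       # nonzero values of A in row-major order
```

Whether the Fortran side expects 0-based or 1-based pointers would still need checking against the glmnet sources.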
Hello, how do you use Multinomial() or other distributions for the family option? I tried
glmnet(X,Y, family::Multinomial())
and glmnet(X,Y, family=:Multinomial())
where X and Y are both 2-D arrays.
Probably an easy question, but I just couldn't figure out how to use it. Thanks for your help.
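A sketch of what may be the intended call, based on the package's `glmnet(::Matrix, ::Matrix, ::Distribution; kw...)` methods visible in other error messages here: the distribution is a positional argument, not a keyword, and Y is an n x k indicator (or count) matrix. The `onehot` helper below is hypothetical, used only to build such a matrix from labels:

```julia
# Build an n x k indicator matrix from a label vector.
function onehot(labels::AbstractVector)
    classes = unique(labels)
    Float64[l == c for l in labels, c in classes]
end

labels = ["a", "b", "c", "a"]
Y = onehot(labels)                 # 4x3 indicator matrix
# glmnet(X, Y, Multinomial())      # assumed call, with a real X
```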
When I try to set the family to Binomial, glmnet gives me an error:
ERROR: MethodError: no method matching glmnet!(::Matrix{Float64}, ::Vector{Float64}, ::Binomial{Float64}; alpha=1, standardize=false, intercept=true, lambda=[0.0005])
Closest candidates are:
glmnet!(::Matrix{Float64}, ::Matrix{Float64}, ::Binomial; offsets, weights, alpha, penalty_factor, constraints, dfmax, pmax, nlambda, lambda_min_ratio, lambda, tol, standardize, intercept, maxit, algorithm) at ~/.julia/packages/GLMNet/Bzpup/src/GLMNet.jl:337
I am using version 1.8.
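Judging from the closest-candidates list, the Binomial path only accepts y as an n x 2 matrix, not a Vector{Float64}. A sketch of expanding a 0/1 vector into that form; which column is treated as the target class is an assumption worth checking against the README:

```julia
# Expand a 0/1 response vector into a two-column count matrix.
y = [1.0, 0.0, 0.0, 1.0]
Y = hcat(1 .- y, y)       # column 1: class 0, column 2: class 1 (assumed order)
# glmnet(X, Y, Binomial(); alpha = 1, lambda = [0.0005])  # hypothetical call
```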
Hello, I stumbled over a few things trying to follow the example problem.
It looks like the DataArrays package is needed for the definition of rep in the glmnetcv function.
Also, I had a little trouble with the convert method called from within the show function:
ERROR: no method convert(Type{Array{T,2}}, CompressedPredictorMatrix)
in convert at base.jl:11
Changing the function name "convert" to something else worked, but being brand new to Julia, I don't yet know the proper way to fix this.
I am encountering an error when trying to run a binomial logistic ridge regression. The error does not occur when using the default least squares regression.
import GLMNet
cv_gauss = GLMNet.glmnetcv(X, y, GLMNet.Normal(), alpha=0.0)
cv_binom = GLMNet.glmnetcv(X, y, GLMNet.Binomial(), alpha=0.0)
The final line of code produces the following error:
no method glmnet!(Array{Any,1},Array{Float64,2},Array{Float64,1},Binomial)
at In[100]:3
in glmnet at /home/dhimmels/.julia/v0.2/GLMNet/src/GLMNet.jl:346
in glmnetcv at /home/dhimmels/.julia/v0.2/GLMNet/src/GLMNet.jl:382
I am new to julia, so perhaps I'm missing something. Thanks.
From README:
For logistic models, y is either a string vector or an m x 2 matrix
But the following doesn't work
using GLMNet
y = ["M", "B", "M", "B"]
X = rand(4, 10)
glmnet(X, y, Binomial())
MethodError: no method matching glmnet(::Matrix{Float64}, ::Vector{String}, ::Binomial{Float64})
Closest candidates are:
glmnet(::AbstractMatrix{T} where T, ::AbstractVector{T} where T, ::AbstractVector{T} where T) at /home/users/bbchu/.julia/packages/GLMNet/C8WKF/src/CoxNet.jl:151
glmnet(::AbstractMatrix{T} where T, ::AbstractVector{T} where T, ::AbstractVector{T} where T, ::CoxPH; kw...) at /home/users/bbchu/.julia/packages/GLMNet/C8WKF/src/CoxNet.jl:151
glmnet(::Matrix{Float64}, ::Vector{Float64}, ::Distribution; kw...) at /home/users/bbchu/.julia/packages/GLMNet/C8WKF/src/GLMNet.jl:485
...
Fortunately, if y is a matrix with 2 columns, it does work:
y = [1 0; 0 1; 0 1; 1 0]
X = rand(4, 10)
glmnet(X, y, Binomial())
Logistic GLMNet Solution Path (100 solutions for 10 predictors in 833 passes):
────────────────────────────────
df pct_dev λ
────────────────────────────────
[1] 0 0.0 0.476672
[2] 1 0.0582906 0.455006
[3] 1 0.11166 0.434325
[4] 1 0.160737 0.414585
[5] 1 0.206039 0.395741
[6] 1 0.248 0.377754
[7] 1 0.286986 0.360585
...
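Until the string-vector method works as documented, a small hypothetical helper can encode a two-level string vector as the indicator matrix used in the working example above (the sorted-levels convention is an assumption):

```julia
# Encode a two-level label vector as an n x 2 indicator matrix.
function binomial_matrix(y::AbstractVector{<:AbstractString})
    levels = sort(unique(y))
    @assert length(levels) == 2 "expected exactly two classes"
    Float64[yi == l for yi in y, l in levels]
end

Y = binomial_matrix(["M", "B", "M", "B"])
# glmnet(X, Y, Binomial())    # as in the working matrix example
```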
The glmnet source in this repository is outdated, dating back to 2015. The glmnet Fortran backbone has since been updated several times. Please consider updating to the latest version.
The source files can be found at https://github.com/cran/glmnet/tree/master/src
The README states constraints should be an n x 2 matrix. However, it is actually a 2 x n matrix.
I'd prefer if it were n x 2, but if it's supposed to be 2 x n, it would be good to fix the README in order to avoid confusion.
The URL of this package does not match that stored in METADATA.jl.
cc: @ararslan
How do I specify the family as in R, e.g. glmnetcv(X, y, family = "binomial")?
Line 195 of GLMNet.jl:
warn("glmnet: number of non-zero coefficients along path exceeds $nx at $(maxit+10000)th lambda value")
But the variable nx is not defined anywhere.
It appears that the build of this package, using gfortran on OS X 10.9.5, tries to use the -shared option, which that compiler does not recognize.
julia> Pkg.build("GLMNet")
INFO: Building GLMNet
i686-apple-darwin8-gfortran-4.2: unrecognized option '-shared'
Undefined symbols for architecture x86_64:
"MAIN_", referenced from:
_main in libgfortranbegin.a(fmain.o)
ld: symbol(s) not found for architecture x86_64
collect2: ld returned 1 exit status
========================================[ ERROR: GLMNet ]========================================
failed process: Process(gfortran -m64 -fdefault-real-8 -ffixed-form -fPIC -shared -O3 glmnet3.f90 -o libglmnet.so
, ProcessExited(1)) [1]
while loading /Users/psz/.julia/v0.3/GLMNet/deps/build.jl, in expression starting on line 3
The GLMNet package includes the Relaxed Lasso option, which recent research has shown performs very well.
Would it be possible for GLMNet.jl to allow this?
glmnetcv(X, y)
in quickstart gives error:
MethodError: glmnetcv(::Array{Float64,2}, ::Array{Float64,1}) is ambiguous
PackageEvaluator.jl is a script that runs nightly. It attempts to load all Julia packages and run their tests (if available) on both the stable version of Julia (0.2) and the nightly build of the unstable version (0.3). The results of this script are used to generate a package listing enhanced with testing results.
Tests pass.
Package doesn't load.
'Tests pass.' means that PackageEvaluator found the tests for your package, executed them, and they all passed.
'Package doesn't load.' means that PackageEvaluator did not find tests for your package. Additionally, trying to load your package with using failed.
This issue was filed because your testing status became worse. No additional issues will be filed if your package remains in this state, and no issue will be filed if it improves. If you'd like to opt out of these status-change messages, reply to this message saying so and @IainNZ will add an exception. If you'd like to discuss PackageEvaluator.jl, please file an issue at its repository. For example, your package may be untestable on the test machine due to a dependency; an exception can be added for that.
Test log:
INFO: Installing ArrayViews v0.4.6
INFO: Installing DataArrays v0.1.12
INFO: Installing DataFrames v0.5.6
INFO: Installing Distributions v0.5.2
INFO: Installing GLMNet v0.0.2
INFO: Installing GZip v0.2.13
INFO: Installing PDMats v0.2.0
INFO: Installing Reexport v0.0.1
INFO: Installing SortingAlgorithms v0.0.1
INFO: Installing StatsBase v0.5.3
INFO: Building GLMNet
INFO: Package database updated
Warning: could not import Sort.sortby into DataFrames
Warning: could not import Sort.sortby! into DataFrames
ERROR: repl_show not defined
in include at ./boot.jl:245
in include_from_node1 at ./loading.jl:128
in include at ./boot.jl:245
in include_from_node1 at ./loading.jl:128
in reload_path at loading.jl:152
in _require at loading.jl:67
in require at loading.jl:54
in include at ./boot.jl:245
in include_from_node1 at ./loading.jl:128
in reload_path at loading.jl:152
in _require at loading.jl:67
in require at loading.jl:51
in include at ./boot.jl:245
in include_from_node1 at loading.jl:128
in process_options at ./client.jl:285
in _start at ./client.jl:354
while loading /home/idunning/pkgtest/.julia/v0.3/DataFrames/src/dataframe/reshape.jl, in expression starting on line 163
while loading /home/idunning/pkgtest/.julia/v0.3/DataFrames/src/DataFrames.jl, in expression starting on line 110
while loading /home/idunning/pkgtest/.julia/v0.3/GLMNet/src/GLMNet.jl, in expression starting on line 2
while loading /home/idunning/pkgtest/.julia/v0.3/GLMNet/testusing.jl, in expression starting on line 1
INFO: Package database updated
Note this is possibly due to removal of deprecated functions in Julia 0.3-rc1: JuliaLang/julia#7609
Hello. I just saw that GLMNet v0.1.0 has been released according to the GLMNet.jl site.
However, when I ran Pkg.update(), my current GLMNet v0.0.5 was not updated to v0.1.0. I also tried to remove GLMNet v0.0.5 and reinstall it via Pkg.add("GLMNet"), but still v0.0.5 was installed. Why does this happen, and how can I get v0.1.0 without using Pkg.checkout("GLMNet")? Since v0.1.0 is an official release, I should be able to get that version simply by Pkg.update(), in principle, correct?
Best,
BVPs
In trying to get the cross-validation output from glmnet in R and GLMNet.jl to conform, I find the losses differ even when everything else (lambda sequence, fold id) is the same across the two. This yields an argmin(cv.meanloss) (Julia) different from which.min(cv$cvm) (R), and it sometimes matters. What is the source of the difference?
Example:
R
require(glmnet)
data <- iris
foldid <- rep(1:10, nrow(data) / 10)
x <- model.matrix(data = data, ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width)
cvl <- cv.glmnet(y = data$Species, x = x,
family = "multinomial", alignment = "fraction", foldid = foldid)
round(cvl$lambda, 8)
cvl$cvm
2.1972246 2.0531159 1.9324868 ...
julia
using Pkg
Pkg.add("RDatasets")
Pkg.add("GLMNet")
Pkg.add("GLM")
using RDatasets, GLMNet, GLM
iris = dataset("datasets", "iris")
fml = @formula(Species ~ SepalLength + SepalWidth + PetalLength + PetalWidth)
x = ModelMatrix(ModelFrame(fml, iris)).m
foldid = repeat(1:10, Int(size(iris, 1) / 10))
cvl = glmnetcv(x, iris.Species; folds = foldid)
cvl.lambda'
cvl.meanloss
2.1955639962247964
2.0530748153377423
1.9324668652650965
...
When I try to build GLMNet.jl, it tells me that the file is not found. I dug around, and it looks like the problem is inside the build.jl file:
pic = @windows ? "" : "-fPIC"
run(`gfortran -m$WORD_SIZE -fdefault-real-8 -ffixed-form $pic -shared -O3 glmnet5.f90 -o libglmnet.so`)
and when the run( ) line is read, under Windows it will change $pic to '' instead of an actual blank, which messes up the build. When I removed the $pic from that line, it built fine.
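A sketch of one possible fix, assuming the problem is that the empty placeholder string is passed to gfortran as a literal empty argument on Windows: build the flag list conditionally instead of interpolating a possibly-empty string (interpolating an array into a Cmd splats its elements).

```julia
# Assemble compiler flags without an empty-string placeholder.
flags = ["-fdefault-real-8", "-ffixed-form", "-shared", "-O3"]
iswindows = false                   # stand-in for Sys.iswindows()
iswindows || push!(flags, "-fPIC")  # only add -fPIC off Windows
# run(`gfortran -m$(Sys.WORD_SIZE) $flags glmnet5.f90 -o libglmnet.so`)
```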
This issue is being filed by a script, but if you reply, I will see it.
PackageEvaluator.jl is a script that runs nightly. It attempts to load all Julia packages and run their tests (if available) on both the stable version of Julia (0.2) and the nightly build of the unstable version (0.3).
The results of this script are used to generate a package listing enhanced with testing results.
The status of this package, GLMNet, on...
'No tests, but package loads.' can be due to there being no tests (you should write some if you can!) but can also be due to PackageEvaluator not being able to find your tests. Consider adding a test/runtests.jl
file.
'Package doesn't load.' is the worst-case scenario. Sometimes this arises because your package doesn't have BinDeps support, or needs something that can't be installed with BinDeps. If this is the case for your package, please file an issue and an exception can be made so your package will not be tested.
This automatically filed issue is a one-off message. Starting soon, issues will only be filed when the testing status of your package changes in a negative direction (gets worse). If you'd like to opt-out of these status-change messages, reply to this message.
Maybe I'm blind, but how can I fit a linear lasso model on multiple dependent variables?
There doesn't seem to be a method for
X = rand(100, 10)
Y = rand(100, 4)
path = glmnet(X, Y, Normal())
# or
path = glmnet(X, Y, MvNormal())
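Assuming no multi-response (mgaussian-style) method exists in the wrapper, one workaround is to fit one Gaussian path per response column. The sketch below takes the fitting function as an argument (`fitfun` would be `(X, y) -> glmnet(X, y, Normal())` in real use; a dummy is used here for illustration):

```julia
# Fit each response column independently; returns one result per column.
fit_columns(fitfun, X, Y) = [fitfun(X, Y[:, j]) for j in 1:size(Y, 2)]

X = rand(100, 10)
Y = rand(100, 4)
paths = fit_columns((X, y) -> sum(y), X, Y)   # dummy fitfun for illustration
length(paths)                                  # one result per response
```

Note this loses the shared-sparsity grouping that R's mgaussian family provides, so it is only an approximation of the multi-response model.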
It could be nice to return the loglik for each lambda, this would make it easier to choose the set of betas based on BIC or AIC.
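In the meantime, for a Gaussian fit the residual deviance can plausibly be recovered as (1 - dev_ratio) * null_dev (i.e. the RSS), which gives information criteria up to an additive constant; the field names and this reconstruction are assumptions about the path object:

```julia
# AIC/BIC (up to a constant) from per-lambda deviance ratios, Gaussian case.
function ic_from_path(dev_ratio, null_dev, df, n)
    rss = (1 .- dev_ratio) .* null_dev
    aic = n .* log.(rss ./ n) .+ 2 .* df
    bic = n .* log.(rss ./ n) .+ log(n) .* df
    aic, bic
end

aic, bic = ic_from_path([0.0, 0.5], 200.0, [0, 1], 100)
```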
From the basic regression model in the quick start at https://github.com/JuliaStats/GLMNet.jl: if y is of type Array{Int64,1}, glmnet will give an ambiguity error:
ERROR: MethodError: glmnet(::Array{Float64,2}, ::Array{Int64,1}) is ambiguous. Candidates:
glmnet(X::Array{Float64,2}, y; kw...) in GLMNet at C:\Users\ly.julia\packages\GLMNet\1uQom\src\Multinomial.jl:191
glmnet(X::AbstractArray{T,2} where T, y::AbstractArray{var"#s77",1} where var"#s77"<:Number) in GLMNet at C:\Users\ly.julia\packages\GLMNet\1uQom\src\GLMNet.jl:492
Possible fix, define
glmnet(::Array{Float64,2}, ::AbstractArray{var"#s77",1} where var"#s77"<:Number)
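A sketch of a workaround, assuming the ambiguity is resolved once y matches the concrete Vector{Float64} method shown in other signatures here: convert the integer response before the call.

```julia
# Convert an integer response vector to Float64 before calling glmnet.
y_int = [1, 2, 3]
y = Float64.(y_int)
# glmnet(X, y)    # with a real X::Matrix{Float64}; assumed to dispatch cleanly
eltype(y)
```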
This issue is used to trigger TagBot; feel free to unsubscribe.
If you haven't already, you should update your TagBot.yml
to include issue comment triggers.
Please see this post on Discourse for instructions and more details.
If you'd like for me to do this for you, comment TagBot fix
on this issue.
I'll open a PR within a few hours, please be patient!
The null deviances seem to be incorrect when I run GLMNet.glmnet; I'm getting absurdly small numbers.
Example: this code
# Simulated data set
X_sim = randn((100,10));
beta_sim = randn(10);
y_sim = randn(100) .+ (X_sim * beta_sim .+ 3.14)
sim_path = GLMNet.glmnet(X_sim, y_sim)
Produces the output:
Least Squares GLMNet Solution Path (64 solutions for 10 predictors in 266 passes):
───────────────────────────────
df pct_dev λ
───────────────────────────────
[1] 0 0.0 3.15936
[2] 1 0.0716368 2.87869
[3] 1 0.131111 2.62295
[4] 1 0.180487 2.38994
[5] 1 0.221481 2.17762
[6] 2 0.270351 1.98417
...
Which seems fine -- but then when I run
sim_path.null_dev
I get an absurdly small number:
6.240013019814641e-34
In contrast, when I compute the null deviance (sum of squares) myself:
size(X_sim, 1) * var(y_sim)
I get
2389.5611952108716
Have I misunderstood something? It's easy enough to compute the null deviance on my own, but it seems like GLMNet.jl isn't computing it as advertised.
And I don't see it covered in your unit tests. So maybe this was a small blind spot.
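For reference, the quantity the reporter expected in null_dev, the Gaussian null deviance, is just the total sum of squares around the mean, computable with the Statistics standard library:

```julia
using Statistics

# Gaussian null deviance: total sum of squares around the mean.
null_deviance(y) = sum(abs2, y .- mean(y))

y = [1.0, 2.0, 3.0, 4.0]
null_deviance(y)    # 5.0
```

(The reporter's size(X_sim, 1) * var(y_sim) is this quantity scaled by n/(n-1), since var uses the n-1 denominator, but either is of order 10^3 here, nowhere near 1e-34.)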