nlsolvers.jl's Introduction

NLSolvers

NLSolvers provides optimization, curve fitting, and equation solving functionalities for Julia. The goal is to provide a set of robust and flexible methods that run fast. Currently, the package does not try to implement any automatic generation of unspecified functions (gradients, Hessians, Hessian-vector products) using AD.

NLSolvers.jl uses different problem types for different problems

  • OptimizationProblem for optimization problems
  • NEqProblem for systems of non-linear equations
  • FixedPointProblem for fixed-point problems

Optimization Problems

Take the following scalar objective (with scalar input)

using NLSolvers
function objective(x)
    fx = x^4 + sin(x)
end
function gradient(∇f, x)
    ∇f = 4x^3 + cos(x)
    return ∇f
end
objective_gradient(∇f, x) = objective(x), gradient(∇f, x)
function hessian(∇²f, x)
    ∇²f = 12x^2 - sin(x)
    return ∇²f
end
function objective_gradient_hessian(∇f, ∇²f, x)
    f, ∇f = objective_gradient(∇f, x)
    ∇²f = hessian(∇²f, x)
    return f, ∇f, ∇²f
end
scalarobj = ScalarObjective(f=objective,
                            g=gradient,
                            fg=objective_gradient,
                            fgh=objective_gradient_hessian,
                            h=hessian)
optprob = OptimizationProblem(scalarobj; inplace=false) # scalar input, so not inplace

solve(optprob, 0.3, LineSearch(Newton()), OptimizationOptions())

With output

Results of minimization

* Algorithm:
  Newton's method with default linsolve with backtracking (no interp)

* Candidate solution:
  Final objective value:    -4.35e-01
  Final gradient norm:      3.07e-12

  Initial objective value:  3.04e-01
  Initial gradient norm:    1.06e+00

* Stopping criteria
  |x - x'|              = 6.39e-07 <= 0.00e+00 (false)
  |x - x'|/|x|          = 1.08e-06 <= 0.00e+00 (false)
  |f(x) - f(x')|        = 9.71e-13 <= 0.00e+00 (false)
  |f(x) - f(x')|/|f(x)| = 2.23e-12 <= 0.00e+00 (false)
  |g(x)|                = 3.07e-12 <= 1.00e-08 (true)
  |g(x)|/|g(x₀)|        = 2.88e-12 <= 0.00e+00 (false)

* Work counters
  Seconds run:   7.15e-06
  Iterations:    6

The problem types are especially useful when manifolds, bounds, and other constraints enter the picture.

Let's take the same problem as above but write it with arrays and a mutating code style. The inplace keyword argument to OptimizationProblem selects the code paths used internally. If set to true, cache arrays are updated in place and the input type(s) are promised to allow mutation. If set to false, no operation will mutate its inputs or happen in place.

using NLSolvers
function objective_ip(x)
    fx = x[1]^4 + sin(x[1])
end
function gradient_ip(∇f, x)
    ∇f[1] = 4x[1]^3 + cos(x[1])
    return ∇f
end
objective_gradient_ip(∇f, x) = objective_ip(x), gradient_ip(∇f, x)
function hessian_ip(∇²f, x)
    ∇²f[1] = 12x[1]^2 - sin(x[1])
    return ∇²f
end
function objective_gradient_hessian_ip(∇f, ∇²f, x)
    f, ∇f = objective_gradient_ip(∇f, x)
    ∇²f = hessian_ip(∇²f, x)
    return f, ∇f, ∇²f
end
scalarobj_ip = ScalarObjective(f=objective_ip,
                            g=gradient_ip,
                            fg=objective_gradient_ip,
                            fgh=objective_gradient_hessian_ip,
                            h=hessian_ip)
optprob_ip = OptimizationProblem(scalarobj_ip; inplace=true)

solve(optprob_ip, [0.3], LineSearch(Newton()), OptimizationOptions())

which gives

Results of minimization

* Algorithm:
  Newton's method with default linsolve with backtracking (no interp)

* Candidate solution:
  Final objective value:    -4.35e-01
  Final gradient norm:      3.07e-12

  Initial objective value:  3.04e-01
  Initial gradient norm:    1.06e+00

* Stopping criteria
  |x - x'|              = 6.39e-07 <= 0.00e+00 (false)
  |x - x'|/|x|          = 1.08e-06 <= 0.00e+00 (false)
  |f(x) - f(x')|        = 9.71e-13 <= 0.00e+00 (false)
  |f(x) - f(x')|/|f(x)| = 2.23e-12 <= 0.00e+00 (false)
  |g(x)|                = 3.07e-12 <= 1.00e-08 (true)
  |g(x)|/|g(x₀)|        = 2.88e-12 <= 0.00e+00 (false)

* Work counters
  Seconds run:   1.10e-05
  Iterations:    6

as above. Another set of examples uses SArrays and MArrays from the StaticArrays.jl package.

using StaticArrays
solve(optprob_ip, @MVector([0.3]), LineSearch(Newton()), OptimizationOptions())

which gives

Results of minimization

* Algorithm:
  Newton's method with default linsolve with backtracking (no interp)

* Candidate solution:
  Final objective value:    -4.35e-01
  Final gradient norm:      3.07e-12

  Initial objective value:  3.04e-01
  Initial gradient norm:    1.06e+00

* Stopping criteria
  |x - x'|              = 6.39e-07 <= 0.00e+00 (false)
  |x - x'|/|x|          = 1.08e-06 <= 0.00e+00 (false)
  |f(x) - f(x')|        = 9.71e-13 <= 0.00e+00 (false)
  |f(x) - f(x')|/|f(x)| = 2.23e-12 <= 0.00e+00 (false)
  |g(x)|                = 3.07e-12 <= 1.00e-08 (true)
  |g(x)|/|g(x₀)|        = 2.88e-12 <= 0.00e+00 (false)

* Work counters
  Seconds run:   5.68e-04
  Iterations:    6

So numbers, mutating array code, and non-mutating array code are all supported, depending on the inplace flag of the problem and the type of the initial x (or the state in general).
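
For example, a fully non-mutating StaticArrays setup could look like this (a sketch; the function names are illustrative and everything is left at its default settings):

using NLSolvers, StaticArrays

obj_s(x) = x[1]^4 + sin(x[1])
grad_s(∇f, x) = SVector(4x[1]^3 + cos(x[1]))
objgrad_s(∇f, x) = obj_s(x), grad_s(∇f, x)
hess_s(∇²f, x) = SMatrix{1,1}(12x[1]^2 - sin(x[1]))
objgradhess_s(∇f, ∇²f, x) = obj_s(x), grad_s(∇f, x), hess_s(∇²f, x)

sobj = ScalarObjective(f=obj_s, g=grad_s, fg=objgrad_s,
                       fgh=objgradhess_s, h=hess_s)
sprob = OptimizationProblem(sobj; inplace=false)
solve(sprob, SVector(0.3), LineSearch(Newton()), OptimizationOptions())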

BigFloat eigen

To use the NWI algorithm in a TrustRegion solver with BigFloat numbers, it is necessary to first run using GenericLinearAlgebra, since the eigendecomposition NWI relies on is not available for BigFloat matrices in LinearAlgebra alone.
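
For example (a sketch; it assumes the subproblem solver is constructed as NWI(), which is not shown elsewhere in this README, and reuses optprob_ip from above):

using GenericLinearAlgebra

# Assumed constructor name NWI(), based on the note above. BigFloat input
# makes the eigendecomposition in the trust-region step generic;
# GenericLinearAlgebra provides it.
solve(optprob_ip, BigFloat[0.3], TrustRegion(Newton(), NWI()), OptimizationOptions())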

Systems of Nonlinear Equations NEqProblem

To solve a system of non-linear equations you should use the NEqProblem type. First, we have to define a VectorObjective. As an example, we can solve for a root of the residual system of the Rosenbrock test problem.

using LinearMaps

function F_rosenbrock!(Fx, x)
    Fx[1] = 1 - x[1]
    Fx[2] = 10(x[2]-x[1]^2)
    return Fx
end
function J_rosenbrock!(Jx, x)
    Jx[1,1] = -1
    Jx[1,2] = 0
    Jx[2,1] = -20*x[1]
    Jx[2,2] = 10
    return Jx
end
function FJ_rosenbrock!(Fx, Jx, x)
    F_rosenbrock!(Fx, x)
    J_rosenbrock!(Jx, x)
    Fx, Jx
end
function Jvop_rosenbrock!(x)
    function JacV(Fv, v)
        Fv[1] = -1*v[1]
        Fv[2] = -20*x[1]*v[1] + 10*v[2]
    end
    LinearMap(JacV, length(x))
end
vectorobj = NLSolvers.VectorObjective(F_rosenbrock!, J_rosenbrock!, FJ_rosenbrock!, Jvop_rosenbrock!)

and define a problem type that lets solve dispatch to the correct code

vectorprob = NEqProblem(vectorobj)

and we can solve it using two variants of Newton's method: one that globalizes the solve with a trust region, and one that uses a line search


julia> solve(vectorprob, [5.0, 0.0], TrustRegion(Newton(), Dogleg()), NEqOptions())
Results of solving non-linear equations

* Algorithm:
  Newton's method with default linsolve with Dogleg{Nothing}

* Candidate solution:
  Final residual 2-norm:      5.24e-14
  Final residual Inf-norm:    5.24e-14

  Initial residual 2-norm:    6.25e+04
  Initial residual Inf-norm:  2.50e+02

* Stopping criteria
  |F(x')|               = 5.24e-14 <= 0.00e+00 (false)

* Work counters
  Seconds run:   1.91e-05
  Iterations:    2


julia> solve(vectorprob, [5.0, 0.0], LineSearch(Newton()), NEqOptions())
Results of solving non-linear equations

* Algorithm:
  Newton's method with default linsolve with backtracking (no interp)

* Candidate solution:
  Final residual 2-norm:      0.00e+00
  Final residual Inf-norm:    0.00e+00

  Initial residual 2-norm:    2.50e+02
  Initial residual Inf-norm:  2.50e+02

* Stopping criteria
  |F(x')|               = 0.00e+00 <= 0.00e+00 (true)

* Work counters
  Seconds run:   1.00e-05
  Iterations:    2

Univariate optimization

The only method that is exclusively univariate is Brent's method for function minimization BrentMin.

Multivariate optimization

Sampling based

  • SimulatedAnnealing
  • PureRandomSearch
  • ParticleSwarm

Direct search

  • NelderMead

Quasi-Newton Line search

  • DBFGS
  • BFGS
  • SR1
  • DFP
  • GradientDescent
  • LBFGS
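
Any of these can be used in place of Newton in the LineSearch examples above; for instance (a sketch reusing optprob_ip from the earlier example, with BFGS left at its default settings):

solve(optprob_ip, [0.3], LineSearch(BFGS()), OptimizationOptions())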

Conjugate Gradient Descent

  • ConjugateGradient

Newton Line Search

  • Newton

Gradient based (no line search)

  • Adam
  • AdaMax

Acceleration methods

  • Anderson

Krylov

  • InexactNewton

BB style

  • BB
  • DFSANE

Simple bounds (box)

  • ParticleSwarm
  • ActiveBox

Custom solve

Newton methods generally accept a linsolve keyword argument that lets you supply a custom linear solver for the Newton step.
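
For example, a plain LU-based linsolve could look like this (a sketch; the two-method signature, in-place (d, B, ∇f) and out-of-place (B, ∇f), follows the cholesky_linsolve example quoted in the issues below, and lu_linsolve is an illustrative name, not part of the package):

using LinearAlgebra

# In-place method: write the solution of B*d = ∇f into d.
lu_linsolve(d, B, ∇f) = ldiv!(d, lu(B), ∇f)
# Out-of-place method.
lu_linsolve(B, ∇f) = lu(B) \ ∇f

solve(optprob_ip, [0.3], LineSearch(Newton(linsolve = lu_linsolve)), OptimizationOptions())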

Preconditioning

Several methods accept nonlinear (left-)preconditioners. A preconditioner is provided as a function with two methods: p(x), which prepares and returns the preconditioner, and p(x, P), which updates it. If the preconditioner is constant, both methods simply return that preconditioner. A preconditioner is used in two contexts: in ldiv!(pgr, factorize(P), gr), which accepts a cache array pgr for the preconditioned gradient, the preconditioner P, and the gradient gr to be preconditioned, and in mul!(x, P, y). For the out-of-place methods (minimize as opposed to minimize!) it is sufficient to have \(P, gr) and *(P, y) defined.
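
As a minimal sketch of that convention, a constant diagonal preconditioner could be written as follows (how it is then passed to a particular solver depends on that solver's keyword arguments, which are not shown here):

using LinearAlgebra

# Constant preconditioner: the one-argument method builds it, and the
# two-argument method updates it; since it is constant, both simply
# return the same Diagonal matrix.
precon(x) = Diagonal(fill(2.0, length(x)))
precon(x, P) = P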

Beware, chaotic gradient methods!

Some methods that might be labeled as acceleration, momentum, or spectral methods can exhibit chaotic behavior. Please keep this in mind when comparing something like DFSANE with similar implementations in other software: it can give very different results under different compiler optimizations, CPU architectures, and so on. See for example https://link.springer.com/article/10.1007/s10915-011-9521-3 .

TODO

  • Documentation in each type's docstring, including LineSearch, BFGS, ...
  • Abstract arrays
  • Manifolds
  • User-supplied norms
  • Banded Jacobians
  • AD
  • NaN Hessians
  • Line search should short-circuit for very small steps

Next steps

  • Mixed complementarity
  • SAMIN, BOXES
  • Univariate optimization
  • IP Newton
  • Krylov Hessian

nlsolvers.jl's People

Contributors

ararslan, longemen3000, mohamed82008, pkofod


nlsolvers.jl's Issues

Change direction for updating x0

Just allow the user to pass it in via the caches and update that according to x0. If x0 and the cache array are the same, then it's just a simple in-place update.

ProjectedNewton

Big oops: B0 is hardcoded to be [1.0 0.0; 0.0 1.0]. Fixed.

It should be possible to input B0.

Improve BrentMin edge handling

#38 made sure that edges were evaluated. It might be better to use fa, fx, fb to set (v, fv), (w, fw), and (x, fx) instead of only x and fx.

`NLSolvers.converged(ci::ConvergenceInfo)`

I'm reviewing some code that uses Optim, and it uses the x_converged, g_converged, and f_converged fields of OptimizationResults. At the moment, there are functions with the same names, but they are not defined on the ConvergenceInfo object. A mockup (based on show(io, mime, ci::ConvergenceInfo)) would be the following:

using NLSolvers, LinearAlgebra

x_converged(ci::NLSolvers.ConvergenceInfo) = false
f_converged(ci::NLSolvers.ConvergenceInfo) = false
g_converged(ci::NLSolvers.ConvergenceInfo) = false

function x_converged(ci::NLSolvers.ConvergenceInfo{<:NLSolvers.NelderMead,<:Any,<:NLSolvers.OptimizationOptions})
    return ci.info.ρs <= ci.options.x_abstol
end

function x_converged(ci::NLSolvers.ConvergenceInfo{<:NLSolvers.SimulatedAnnealing,<:Any,<:NLSolvers.OptimizationOptions})
    #don't know what to do do here, simulated annealing does not "converge" in the typical sense
    return true
end

function x_converged(ci::NLSolvers.ConvergenceInfo{<:Any,<:Any,<:NLSolvers.OptimizationOptions})
    info = ci.info
    opt = ci.options
    x_abstol = haskey(info, :ρs) && (info.ρs <= opt.x_abstol)
    x_reltol = haskey(info, :ρs) && (info.ρs/info.ρx <= opt.x_reltol)
    return x_abstol || x_reltol
end

function f_converged(ci::NLSolvers.ConvergenceInfo{<:Any,<:Any,<:NLSolvers.OptimizationOptions})
    info = ci.info
    opt = ci.options
    f_limit = isfinite(opt.f_limit) && (info.minimum <= opt.f_limit)
    f_abstol = haskey(info, :fx) && (abs(info.fx - info.minimum) <= opt.f_abstol)
    f_reltol = haskey(info, :fx) && (abs((info.fx - info.minimum)/info.fx) <= opt.f_reltol)
    return f_limit || f_abstol || f_reltol
end

function f_converged(ci::NLSolvers.ConvergenceInfo{<:Any,<:Any,<:NLSolvers.NEqOptions})
    opt = ci.options
    ρF = norm(ci.info.best_residual, Inf)
    #ρFz = norm(ci.info.solution, 2)
    f_abstol = ρF <= opt.f_abstol
    return f_abstol
end

function g_converged(ci::NLSolvers.ConvergenceInfo{<:Any,<:Any,<:NLSolvers.OptimizationOptions})
    info = ci.info
    opt = ci.options
    if haskey(info, :∇fz)
        ρ∇f = opt.g_norm(info.∇fz)
        g_abstol = ρ∇f <= opt.g_abstol
        g_reltol = ρ∇f/info.∇f0 <= opt.g_reltol
        if haskey(info, :prob) && hasbounds(info.prob)
            ρP = opt.g_norm(
                info.solution .- clamp.(info.solution .- info.∇fz, info.prob.bounds...),
            )
            gp_abstol = ρP <= opt.g_abstol
        else
            gp_abstol = false
        end
    else
        g_abstol = false
        g_reltol = false
        gp_abstol = false
    end
    return g_abstol || g_reltol || gp_abstol
end


function converged(ci::NLSolvers.ConvergenceInfo)
    opt = ci.options
    info = ci.info
    if haskey(info, :Δ)
        Δmin = ci.solver.Δupdate.Δmin isa Nothing ? 0 : ci.solver.Δupdate.Δmin
        Δ = info.Δ <= Δmin
        Δ_finite = isfinite(info.Δ)
    else
        Δ = false
        Δ_finite = true
    end
    x = x_converged(ci)
    f = f_converged(ci)
    g = g_converged(ci)
    # finite flags
    x_finite = all(isfinite, NLSolvers.solution(ci))
    conv_flags = f || x || g || Δ
    finite_flags = x_finite && Δ_finite #  && g_finite && f_finite
    return conv_flags && finite_flags
end

What is missing is a general way to calculate the finiteness of the other stopping criteria, to address JuliaNLSolvers/Optim.jl#997.

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

t0 not defined in Anderson

On NLSolvers 0.3, when calling nlsolve(prob, Anderson, NEqOptions(maxiter=0)), the package complains about t0 not being defined. It seems to be fixed on the master branch.

Setup with LinearSolve.jl

I'm trying to get more libraries using the LinearSolve.jl interface

http://linearsolve.sciml.ai/dev/

It's pretty complete now. The suggested form is that you use the caching interface so that it can store factorizations and iterative solver inits (http://linearsolve.sciml.ai/dev/tutorials/caching_interface/). The suggested algorithm form is to just have the user pass the type, like Newton(; linsolve=KLU()) which would then switch over to using SuiteSparse.KLU and accelerate sparse matrix factorizations for unstructured cases over UMFPACK. http://linearsolve.sciml.ai/dev/solvers/solvers/#SuiteSparse.jl

callbacks

I was looking at and comparing the code with Optim.jl, and one of the only missing features is the ability to provide a callback function to short-circuit the procedure.

define `upto_gradient` for `MeritObjective`

something like this?

using LinearAlgebra  # for norm

function NLSolvers.upto_gradient(meritobj::NLSolvers.MeritObjective, ∇f, x)
    neq = meritobj.prob
    G = neq.R.F(∇f, x)
    F = (norm(G)^2) / 2
    return F, G
end

With that, NLSolvers.solve(prob, x0, LineSearch(Newton(), HZAW())) seems to work.
EDIT: no, it doesn't, but at least it hits an error in HZAW instead:
MethodError: no method matching isfinite(::NamedTuple{(:ϕ, :Fx), Tuple{Float64, StaticArrays.MVector{2, Float64}}})

while !isfinite(φc) && iter <= maxiter

Base.summary on Newton with custom linsolve just shows the type

When testing for Cholesky in 9, I found that it was easier to supply the linsolve via Newton(;linsolve) than to modify the code each time. But for a custom linsolve, the summary prints just the type instead of "Newton's method with user supplied linsolve". The error is here:

summary(::Newton{<:Direct, Any}) = "Newton's method with user supplied linsolve"

It dispatches on Any instead of <:Any.
On the other hand, does it make sense to add a Cholesky linsolve? I have this:

using LinearAlgebra, PositiveFactorizations

function cholesky_linsolve(d, B, ∇f)
    cholesky!(Positive, B)
    Bchol = Cholesky(B, 'L', 0)
    d .= Bchol \ ∇f
end

function cholesky_linsolve(B, ∇f)
    Bchol = cholesky(Positive, B)
    Bchol \ ∇f
end

Base.summary(::NLSolvers.Newton{<:Direct, typeof(cholesky_linsolve)}) = "Newton's method with Cholesky linsolve"

The only downside is that it does require a dependency on PositiveFactorizations.

Scout option for PSO

Hi,

It would be great if we incorporated a scout option into the PSO algorithm, as proposed by https://link.springer.com/chapter/10.1007/978-3-319-11128-5_21 and https://academic.oup.com/jcde/article/6/2/129/5732336

The scout option helps the algorithm regenerate particles that are stuck (cannot update their Pbest for some iterations), and hence allows deeper exploration of the space.

This issue is mainly a reminder for me, as I intend to eventually code it myself and prepare a PR ;)

add `f_abstol` convergence measures for NEqProblem

on some testing:

sol = Results of solving non-linear equations

* Algorithm:
  Newton's method with default linsolve with backtracking (no interp)

* Candidate solution:
  Final residual 2-norm:      2.09e-09
  Final residual Inf-norm:    1.29e-09

  Initial residual 2-norm:    7.91e-05
  Initial residual Inf-norm:  4.92e-05

* Convergence measures
  |F(x')|               = 1.29e-09 <= 0.00e+00 (false)

* Work counters
  Seconds run:   1.41e-01
  Iterations:    1

I know that the non-linear solve converged via f_abstol, but that is not shown in the convergence measures. It seems that the keys don't correspond to the actual keys used in non-linear solving (ρF0, ρ2F0, ρs)?
