sciml / optimization.jl Goto Github PK

Mathematical Optimization in Julia. Local, global, gradient-based and derivative-free. Linear, Quadratic, Convex, Mixed-Integer, and Nonlinear Optimization in one simple, fast, and differentiable interface.

Home Page: https://docs.sciml.ai/Optimization/stable/

License: MIT License

Julia 100.00%

optimization local-optimization global-optimization automatic-differentiation algorithmic-differentiation sciml scientific-machine-learning hacktoberfest julia mixed-integer-programming

optimization.jl's Introduction

Optimization.jl

Optimization.jl is a package with a scope that is beyond your normal global optimization package. Optimization.jl seeks to bring together all of the optimization packages it can find, local and global, into one unified Julia interface. This means, you learn one package and you learn them all! Optimization.jl adds a few high-level features, such as integrating with automatic differentiation, to make its usage fairly simple for most cases, while allowing all of the options in a single unified interface.

Installation

Assuming that you already have Julia correctly installed, it suffices to import Optimization.jl in the standard way:

using Pkg
Pkg.add("Optimization")

The packages relevant to the core functionality of Optimization.jl will be imported accordingly and, in most cases, you do not have to worry about the manual installation of dependencies. Below is the list of packages that need to be installed explicitly if you intend to use the specific optimization algorithms offered by them:

OptimizationBBO for BlackBoxOptim.jl
OptimizationEvolutionary for Evolutionary.jl (see also this documentation)
OptimizationGCMAES for GCMAES.jl
OptimizationMOI for MathOptInterface.jl (usage of algorithm via MathOptInterface API; see also the API documentation)
OptimizationMetaheuristics for Metaheuristics.jl (see also this documentation)
OptimizationMultistartOptimization for MultistartOptimization.jl (see also this documentation)
OptimizationNLopt for NLopt.jl (usage via the NLopt API; see also the available algorithms)
OptimizationNOMAD for NOMAD.jl (see also this documentation)
OptimizationNonconvex for Nonconvex.jl (see also this documentation)
OptimizationQuadDIRECT for QuadDIRECT.jl
OptimizationSpeedMapping for SpeedMapping.jl (see also this documentation)

Tutorials and Documentation

For information on using the package, see the stable documentation. Use the in-development documentation for the version of the documentation, which contains the unreleased features.

Examples

using Optimization
rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2
x0 = zeros(2)
p = [1.0, 100.0]

prob = OptimizationProblem(rosenbrock, x0, p)

using OptimizationOptimJL
sol = solve(prob, NelderMead())

using OptimizationBBO
prob = OptimizationProblem(rosenbrock, x0, p, lb = [-1.0, -1.0], ub = [1.0, 1.0])
sol = solve(prob, BBO_adaptive_de_rand_1_bin_radiuslimited())

Note that Optim.jl is a core dependency of Optimization.jl. However, BlackBoxOptim.jl is not and must already be installed (see the list above).

Warning: The output of the second optimization task (BBO_adaptive_de_rand_1_bin_radiuslimited()) is currently misleading in the sense that it returns Status: failure (reached maximum number of iterations). However, convergence is actually reached and the confusing message stems from the reliance on the Optim.jl output struct (where the situation of reaching the maximum number of iterations is rightly regarded as a failure). The improved output struct will soon be implemented.

The output of the first optimization task (with the NelderMead() algorithm) is given below:

* Status: success

* Candidate solution
   Final objective value:     3.525527e-09

* Found with
   Algorithm:     Nelder-Mead

* Convergence measures
   √(Σ(yᵢ-ȳ)²)/n ≤ 1.0e-08

* Work counters
   Seconds run:   0  (vs limit Inf)
   Iterations:    60
   f(x) calls:    118

We can also explore other methods in a similar way:

using ForwardDiff
f = OptimizationFunction(rosenbrock, Optimization.AutoForwardDiff())
prob = OptimizationProblem(f, x0, p)
sol = solve(prob, BFGS())

For instance, the above optimization task produces the following output:

* Status: success

* Candidate solution
   Final objective value:     7.645684e-21

* Found with
   Algorithm:     BFGS

* Convergence measures
   |x - x'|               = 3.48e-07 ≰ 0.0e+00
   |x - x'|/|x'|          = 3.48e-07 ≰ 0.0e+00
   |f(x) - f(x')|         = 6.91e-14 ≰ 0.0e+00
   |f(x) - f(x')|/|f(x')| = 9.03e+06 ≰ 0.0e+00
   |g(x)|                 = 2.32e-09 ≤ 1.0e-08

* Work counters
   Seconds run:   0  (vs limit Inf)
   Iterations:    16
   f(x) calls:    53
   ∇f(x) calls:   53

prob = OptimizationProblem(f, x0, p, lb = [-1.0, -1.0], ub = [1.0, 1.0])
sol = solve(prob, Fminbox(GradientDescent()))

The examples clearly demonstrate that Optimization.jl provides an intuitive way of specifying optimization tasks and offers a relatively easy access to a wide range of optimization algorithms.

optimization.jl's People

Contributors

Stargazers

Watchers

optimization.jl's Issues

ReverseDiff over Zygote to differentiate neural network using GalacticOptim - Dimension mismatch

Hi All, I am struggling a bit with using GalacticOptim.AutoReverseDiff to find the gradients over a loss function which internally calculates a gradient using Zygote. Running ForwardDiff over the function works fine but is quite slow. I have created a mwe as seen below along with the error message. Any help would be appreciated to get around this or find an alternative approach.

using Zygote
using GalacticOptim
using Flux
using LinearAlgebra
using Statistics
using StatsBase

l1 = Dense(2, 10, tanh)
l2 = Dense(10, 4, identity)
network = Chain(l1, l2)
pars = Flux.params(network)
vec_pars, re = Flux.destructure(network)

function loss(x, p)
    network = re(x)
    u = Zygote.gradient(d -> sum(network(d)), p)[1]
    sum(u)
end

data = rand(2, 1000)
loss(vec_pars, data)

adtype = GalacticOptim.AutoReverseDiff()
optf = GalacticOptim.OptimizationFunction(loss, adtype)
optprob = GalacticOptim.OptimizationProblem(optf, vec_pars, data)

result_ad = GalacticOptim.solve(optprob, ADAM(0.001), maxiters=1)

Here is the error message:
DimensionMismatch("array could not be broadcast to match destination") check_broadcast_shape at broadcast.jl:520 [inlined] check_broadcast_axes at broadcast.jl:523 [inlined] check_broadcast_axes at broadcast.jl:527 [inlined] instantiate at broadcast.jl:269 [inlined] materialize! at broadcast.jl:894 [inlined] materialize! at broadcast.jl:891 [inlined] apply!(o::ADAM, x::Vector{Float32}, Δ::Vector{Float64}) at optimisers.jl:179 update!(opt::ADAM, x::Vector{Float32}, x̄::Vector{Float64}) at train.jl:23 update!(opt::ADAM, xs::Params, gs::Zygote.Grads) at train.jl:29 __solve(prob::OptimizationProblem{true, OptimizationFunction{true, GalacticOptim.AutoReverseDiff, typeof(loss), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Vector{Float32}, Matrix{Float64}, Nothing, Nothing, Nothing, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, opt::ADAM, data::Base.Iterators.Cycle{Tuple{GalacticOptim.NullData}}; maxiters::Int64, cb::Function, progress::Bool, save_best::Bool, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}) at solve.jl:106 __solve at solve.jl:66 [inlined] __solve at solve.jl:66 [inlined] #solve#468 at solve.jl:3 [inlined] (::CommonSolve.var"#solve##kw")(::NamedTuple{(:maxiters,), Tuple{Int64}}, ::typeof(solve), ::OptimizationProblem{true, OptimizationFunction{true, GalacticOptim.AutoReverseDiff, typeof(loss), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Vector{Float32}, Matrix{Float64}, Nothing, Nothing, Nothing, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, ::ADAM) at solve.jl:3 top-level scope at MWE.jl:27 eval at boot.jl:360 [inlined]

Pre-compile issues

Hi,

I'm having trouble pre-compiling GalalacticOptim (as well as DiffEqFlux). They both push similar Stacktrace errors as below:

Stacktrace:
[1] error(s::String)
@ Base ./error.jl:33
[2] compilecache(pkg::Base.PkgId, path::String, internal_stderr::Base.TTY, internal_stdout::Base.TTY)
@ Base ./loading.jl:1360
[3] compilecache(pkg::Base.PkgId, path::String)
@ Base ./loading.jl:1306
[4] _require(pkg::Base.PkgId)
@ Base ./loading.jl:1021
[5] require(uuidkey::Base.PkgId)
@ Base ./loading.jl:914
[6] require(into::Module, mod::Symbol)
@ Base ./loading.jl:901

This is after a fresh julia 1.6 install with just the following in the project:

Status ~/.julia/environments/v1.6/Project.toml
[aae7a2af] DiffEqFlux v1.36.1
[41bf760c] DiffEqSensitivity v6.43.2
[0c46a032] DifferentialEquations v6.16.0
[587475ba] Flux v0.12.1
[a75be94c] GalacticOptim v1.1.0
[7073ff75] IJulia v1.23.2
[429524aa] Optim v1.3.0
[91a5bcdd] Plots v1.11.2

Any ideas as to what could be going wrong?

Thanks!

NLopt optimiser does not call objective function

As discussed here, when using NLopt optimisers it appears that no function calls are being made.

This is not being picked up in the tests, because the default objective value assumed by the optimiser, before any objective calls, appears to be 0 (I think, not sure).

Reproducible example:

using GalacticOptim, Optim, Test, NLopt

# this part works
i = 0
loudrosenbrock(x, p) =  begin 
    println("Function Evaluations: $i")
    global i += 1
    (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2
end
x0 = zeros(2)
_p  = [1.0, 100.0]

l1 = loudrosenbrock(x0, _p)
prob = OptimizationProblem(loudrosenbrock, x0, p=_p)
sol = solve(prob, SimulatedAnnealing())
@test 10*sol.minimum < l1



# this part appears to work, but doesn't really
loudrosenbrock(x, p=nothing) = begin
    println("Function Evaluations: $i")
    global i += 1
    (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
end

i = 0
l1 = loudrosenbrock(x0)
prob = OptimizationProblem(rosenbrock, x0)
sol = solve(prob, Opt(:LN_BOBYQA, 2)) 
@test 10*sol.minimum < l1
true_min = loudrosenbrock(sol.minimizer)
@test 10*true_min < l1

MultistartOptimization.jl missing from the documentation

It goes in the global unconstrained optimization page.

NLopt algorithms missing from the documentation

NLopt has a ton, and it should be on almost every optimizer page.

Add objective sense into the front-end

See http://www.juliaopt.org/MathOptInterface.jl/dev/apireference/#MathOptInterface.ObjectiveSense

The default should be min_sense and you might even want to only implement that in the first iteration, but you want to do the API for this correctly from the beginning.

certainly MIN_SENSE and MAX_SENSE are necessary. The last one there is FEASIBILITY_SENSE is for when you are using an optimizer to solve a feasibility problem, but not sure if it is only used in problem transformation or not. The feasibility approach is to solve a system of equations by using the nonlinear optimizer/etc. to find a feasible point for the constraints where you don't care about the objective value.

Regardless, it means that you wouldn't want to design as a min vs max toggle in case feasibility or others should come later.

Tutorial on some advanced MTK things

Add NLopt optimizers to documentation

Feature request: Differentiable optimizers

It would be nice if this package contained rules to AD through optimization problems
https://link.springer.com/article/10.1007/s12532-012-0043-2
https://www.sciencedirect.com/science/article/pii/S2405896318327137

Fix withprogress

We need to do SciML/DiffEqFlux.jl@7e3cf7c here.

Add visualization tooling / plot recipes

FluxOptTools allows visualizing loss when using Optim.jl. There's also some integration between Optim and TensorBoardLogger. Neither of these integrations can be directly used when using GalacticOptim in place of Optim. It seems like a first step would be add similar integrations / recipes for GalacticOptim.

As a first pass, I got TensorBoardLogger to plot loss with the following callback:

function make_tb_cb(name="")
    logger = TBLogger(logdir, tb_overwrite)
    function callback(p, l)
        with_logger(logger) do
            @info "$(name)" loss=l
        end
        return false
    end
    return callback
end

maxiters=100
GalacticOptim.solve(prob, Optim.BFGS(); cb=make_tb_cb("BFGS-$(maxiters)"), maxiters=maxiters)

New return struct

As discussed with @mkg33 , the current return struct from Optim has some weird behavior and a lot of sentinel values so it would be best to make our own OptimizationSolution which is tailored towards our outputs.

Add support for save_best keyword with ADAM optimizer with sciml_train

If you use ADAM optimizer in sciml_train with save_best=true it does not work and just saves last parameter vector.

Support IPNewton from Optim

We would have to add constraints and bounds to OptimizationFunction and use it to store their gradients and other derivatives. It would need a separate _solve dispatch that uses the TwiceDifferentiableConstraints passed in to the optimize call

minibatch tutorial not working with Fminbox


function newtons_cooling(du, u, p, t)
    temp = u[1]
    k, temp_m = p
    du[1] = dT = -k*(temp-temp_m)
  end

function true_sol(du, u, p, t)
    true_p = [log(2)/8.0, 100.0]
    newtons_cooling(du, u, true_p, t)
end

function dudt_(u,p,t)
    ann(u, p).* u
end

cb = function (p,l,pred;doplot=false) #callback function to observe training
    display(l)
    # plot current prediction against data
    if doplot
      pl = scatter(t,ode_data[1,:],label="data")
      scatter!(pl,t,pred[1,:],label="prediction")
      display(plot(pl))
    end
    return false
end

u0 = Float32[200.0]
datasize = 30
tspan = (0.0f0, 1.5f0)

t = range(tspan[1], tspan[2], length=datasize)
true_prob = ODEProblem(true_sol, u0, tspan)
ode_data = Array(solve(true_prob, Tsit5(), saveat=t))

ann = FastChain(FastDense(1,8,tanh), FastDense(8,1,tanh))
pp = initial_params(ann)
prob = ODEProblem{false}(dudt_, u0, tspan, pp)

function predict_adjoint(fullp, time_batch)
    Array(solve(prob, Tsit5(), p = fullp, saveat = time_batch))
end

function loss_adjoint(fullp, batch, time_batch)
    pred = predict_adjoint(fullp,time_batch)
    sum(abs2, batch .- pred), pred
end


k = 10
train_loader = Flux.Data.DataLoader((ode_data, t), batchsize = k)

numEpochs = 300
l1 = loss_adjoint(pp, train_loader.data[1], train_loader.data[2])[1]

# optfun = OptimizationFunction((θ, p, batch, time_batch) -> loss_adjoint(θ, batch, time_batch), GalacticOptim.AutoZygote())
# optprob = OptimizationProblem(optfun, pp)
# using IterTools: ncycle
# res1 = GalacticOptim.solve(optprob, ADAM(0.05), ncycle(train_loader, numEpochs), cb = cb, maxiters=numEpochs)

lower = [-2f0 for _ in 1:length(pp)]
upper = lower .* -1
optfun = OptimizationFunction((θ, p, batch, time_batch) -> loss_adjoint(θ, batch, time_batch), GalacticOptim.AutoZygote())
optprob = OptimizationProblem(optfun, pp, lb=lower, ub=upper)
res2 = GalacticOptim.solve(optprob, Fminbox(SimulatedAnnealing()), ncycle(train_loader, numEpochs), cb = cb, maxiters=100)

throws


MethodError: no method matching (::var"#21#22")(::Vector{Float32}, ::SciMLBase.NullParameters)
Closest candidates are:
  (::var"#21#22")(::Any, ::Any, !Matched::Any, !Matched::Any) at /home/david/github/julia/LTC.jl/src/go_issue.jl:64
in eval at base/boot.jl:360 
in top-level scope at julia/LTC.jl/src/go_issue.jl:66
in  at SciMLBase/EFFG1/src/solve.jl:3
in var"#solve#468" at SciMLBase/EFFG1/src/solve.jl:3
in  at GalacticOptim/JnLwV/src/solve.jl:201
in var"#__solve#32" at GalacticOptim/JnLwV/src/solve.jl:241
in optimize at Optim/uwNqi/src/multivariate/solvers/constrained/fminbox.jl:383
in value_gradient! at Optim/uwNqi/src/multivariate/solvers/constrained/fminbox.jl:94
in value_gradient! at NLSolversBase/geyh3/src/interface.jl:73
in gradient!! at NLSolversBase/geyh3/src/interface.jl:63
in  at GalacticOptim/JnLwV/src/function.jl:106
in gradient at Zygote/6HN9x/src/compiler/interface.jl:58
in pullback at Zygote/6HN9x/src/compiler/interface.jl:40
in _pullback at Zygote/6HN9x/src/compiler/interface.jl:34
in _pullback at Zygote/6HN9x/src/compiler/interface2.jl
in _pullback at GalacticOptim/JnLwV/src/function.jl:106 
in _pullback at ZygoteRules/OjfTt/src/adjoint.jl:57 
in adjoint at Zygote/6HN9x/src/lib/lib.jl:191 
in _apply at base/boot.jl:804
in _pullback at Zygote/6HN9x/src/compiler/interface2.jl
in _pullback at GalacticOptim/JnLwV/src/function.jl:104 
in _pullback at ZygoteRules/OjfTt/src/adjoint.jl:57
in adjoint at base/none
in adjoint at Zygote/6HN9x/src/lib/lib.jl:191 
in _apply at base/boot.jl:804
in _pullback at Zygote/6HN9x/src/compiler/interface2.jl
in _pullback at SciMLBase/EFFG1/src/problems/basic_problems.jl:107 
in _pullback at ZygoteRules/OjfTt/src/adjoint.jl:57 
in adjoint at Zygote/6HN9x/src/lib/lib.jl:191 
in _apply at base/boot.jl:804
in _pullback at Zygote/6HN9x/src/compiler/interface2.jl:9
in macro expansion at Zygote/6HN9x/src/compiler/interface2.jl

The loss functions does not get called with correct arguments

Add OptimizationProblem constructor for `f::OptimizationFunction`

So that fields can be just copied over. I also think we would need bounds and constraints as part of OptimizationFunction down the line so they would also be copied over easily as well.

kwargs only used for Optim.AbstractOptimizer

We need to pass options to Optim while using Fminbox. While sciml_train accepts kwargs for Optim.AbstractOptimizer and passes them on, this does not work with Optim.AbstractConstrainedOptimizer

Add an abstraction for chaining optimizers

It seems like we often have code like:

result_neuralode = DiffEqFlux.sciml_train(loss_neuralode, prob_neuralode.p,
                                          ADAM(0.05), cb = callback,
                                          maxiters = 300)

result_neuralode2 = DiffEqFlux.sciml_train(loss_neuralode,
                                           result_neuralode.minimizer,
                                           LBFGS(),
                                           cb = callback,
                                           allow_f_increases = false)

Here, the result of optimizing with ADAM is further optimized via LBFGS.

As a first pass, I imagine a new OptimizerChain type and sciml_train entrypoint.

chained_optimizer = OptimizerChain([
    ADAM(0.05) => (cb=adam_callback, maxiters=300),
    LBFGS() => (cb=lbfgs_callback, allow_f_increases=false)
])
result_neuralode = DiffEqFlux.sciml_train(loss_neuralode, prob_neuralode.p, chained_optimizer)

That sciml_train method could then do (something like) a left fold over the list of optimizers + kwargs.

Does this interface seem reasonable? Is any information missing?

Stochastic results

using GalacticOptim
using Optim

rosenbrock(x, p) =  (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2
x0 = [0.25, 0.25]
_p  = [1.0, 100.0]
function con2_c(x,p)
    [x[1]^2 + x[2]^2, x[2]*sin(x[1])-x[1]]
end
optprob = OptimizationFunction(rosenbrock, GalacticOptim.AutoForwardDiff();cons= con2_c, num_cons = 2)
prob = OptimizationProblem(optprob, x0, _p, lb = [-Inf, -Inf], ub = [Inf, Inf], lcons = [-Inf,0.0], ucons = [0.5^2,0.0])
sol = solve(prob, IPNewton())
sol.minimizer
for k = 1:1111
    sol = solve(prob, IPNewton())
    println(sol.minimizer)
end

...
[0.285082026303712, 0.22757015261128585]
[0.25, 0.25]
[0.2537219972510135, -0.0006804993127533798]
[0.3233798631411377, 0.0963176197016151]
[0.1579465610290538, 0.006761155169101821]
[0.1136151201147443, 0.006142642686727401]
[0.1447326082975161, 0.006418734928336663]
[0.32810481704926825, 0.09710162352296237]
[0.13271929644578603, -0.00869719926662229]
[0.32637670014523734, 0.12394740549890276]
[0.07792233541448301, -0.012828064426417282]
[0.25, 0.25]
[0.25, 0.25]
[0.3253530660614871, 0.10377836213936834]
[0.25476912756897874, -0.04345904016365443]
[0.25, 0.25]
...

Roadmap

We're already well into this, so here's a roadmap:

Get the basic problem/solve interface done. Depend on DiffEqBase. Use @requires for each algorithm that's not Optim. Let Optim be the default. Basically port over all of what we have for sciml_train. We can have a working version of OptimizationProblem and OptimizationFunction here for now, but eventually (before release) we'll want to upstream it to DiffEqBase so that way MTK can easily target it.
Lazy instantiation of autodiff. When we hit solve, we instantiate the autodiff functions, only the ones that we need, and use them. This makes it thread-safe, and the user can do something at the higher level if they want different behavior. This makes us compatible with Ensembles and such.
Get support for constraints where possible.
Flesh out the options and solver outputs.

cons_j should be using the jvp form

v = zero(x)
v[j] = 1
input .= Dual{DeivVecTag}.(x, v)
jthjaccolumn = f(input)

for computing the jth column without computing the whole Jacobian.

AutoModelingToolkit not working

Following the example from the tutorials, I have tried to run:

rosenbrock(x,p) =  (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2
OptimizationFunction(rosenbrock,GalacticOptim.AutoModelingToolkit())

And I get : ERROR: UndefVarError: AutoModelingToolkit not defined

PS: This is becoming the most exciting optimization package in Julia. I can't wait for it to include optimization with constraints using IPNewton

Tutorial on using GPUs

Improve show methods for OptimizationFunction and OptimizationProblem

Re-enable BlackBoxOptim test

Right now it needed to be disabled since that package is completely unmaintained. Hopefully it comes back, in which case it can be re-enabled. It might be a good idea to fork it.

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Tutorials in docs

The docs don't have any tutorials, for a start we should just add the test there and iterate from there.

Uniform solution type

In the SciML interfaces, all problems and solutions live in SciMLBase (as of... yesterday 😄).

They satisfy the solution interface too:

https://github.com/SciML/SciMLBase.jl/blob/master/src/solutions/solution_interface.jl

I think we should get this return type into there, and make it follow the results of the other solutions, namely:

The return array is named u: that would be the minimizer here. To be non-breaking it could be good to alias .minimizer to .u with a getproperty overload.
We can have a special .minimum
It should have overloads to the build_solution function
It can have .original for keeping the original output struct of the other optimizers
It should be an AbstractNoTimeSolution?
It might want special overloads to printing and such, but follow the same general flow.
It should interpret things into the standard retcodes.
Matching the ODE solver, stuff like iterations might make sense in a .stats OptimizationStatistics.

So I think a lot of https://github.com/SciML/GalacticOptim.jl/blob/master/src/solve.jl#L2-L28 should be copied over to be the basis of this change, but just a few tweaks to bring it in line with the more general AbstractSolution interface.

add Flux optimisers

ChainRules support

The reason for the p in the optimization problem definition is to allow for ChainRules support. Specifically:

https://arxiv.org/abs/1804.05098
https://web.stanford.edu/~boyd/papers/pdf/diff_cvxpy.pdf
https://link.springer.com/article/10.1007/s12532-012-0043-2
https://link.springer.com/article/10.1007/BF02055196
https://arxiv.org/abs/1810.13400

https://link.springer.com/article/10.1007/s12532-012-0043-2 equation (3) might be the clearest.

Register

@JuliaRegistrator register()

@JuliaRegistrator register subdir=lib/OptimizationBBO
@JuliaRegistrator register subdir=lib/OptimizationCMAEvolutionStrategy
@JuliaRegistrator register subdir=lib/OptimizationEvolutionary
@JuliaRegistrator register subdir=lib/OptimizationFlux
@JuliaRegistrator register subdir=lib/OptimizationGCMAES
@JuliaRegistrator register subdir=lib/OptimizationMetaheuristics
@JuliaRegistrator register subdir=lib/OptimizationMOI
@JuliaRegistrator register subdir=lib/OptimizationMultistartOptimization
@JuliaRegistrator register subdir=lib/OptimizationNLopt
@JuliaRegistrator register subdir=lib/OptimizationNOMAD
@JuliaRegistrator register subdir=lib/OptimizationOptimJL
@JuliaRegistrator register subdir=lib/OptimizationOptimisers
@JuliaRegistrator register subdir=lib/OptimizationPolyalgorithms
@JuliaRegistrator register subdir=lib/OptimizationPRIMA
@JuliaRegistrator register subdir=lib/OptimizationQuadDIRECT
@JuliaRegistrator register subdir=lib/OptimizationSpeedMapping

Maxiter does not except scientific e notation

I have just started using GalacticOptim and have to say it's a great package. I wanted to give a quick feedback.

The maxiter kwarg for the solve function behaves sliglty inconsitent when defining iterations by 1e5. When using BBO() it runs fine but fails when creating the object to return and when using Optim.jl optimisers it fails due a Float64 warning.

Not sure if you want the interface to accept e notation (personally I think it would be good idea) but it should be consitent.

Linking Nonconvex.jl

I'm trying to write some code to include methods of https://github.com/mohamed82008/Nonconvex.jl to this repo. Will that be a welcomed PR?
Thanks!

Expand docstrings

We should have verbose docstrings and more examples in the Readme to get users to start shifting to using the package.

Multi objective optimization

The purpose of Multi Objective Optimization is to find to find the efficient (pareto) frontier. If the goal is to wrap all optimizer it might be good to think of an interface to expose such functionality too.
In the opposite direction the presence of such an interface encourage optimizer to implement such a functionality.
It might be possible to convince normal optimizer to do this task using constraints, although they might be less efficient than an algorithm made for that purpose.

Iterations, f(x) calls, ∇f(x) calls incorrect

When using sciml_train with data inputs, the iterations numbers are incorrect.

import DifferentialEquations: Tsit5
import Flux: ADAM
import DiffEqFlux: sciml_train, ODEProblem, solve

function f!(dz, z, params, t)
    dz[:] .= 0.0
end

function loss(p, _data)
    u0 = zeros(5)
    odeprob = ODEProblem(f!, u0, (0, 1.0), p)
    rollout = solve(odeprob, Tsit5(), u0 = u0, p = p)
    Array(rollout)[1, end]
end

res = sciml_train(
    loss,
    rand(3),
    ADAM(),
    [1 2 3 4],
    cb = (params, loss_val) -> begin
        @show loss_val
        false
    end,
)

julia> include("work_counters_bug.jl")
loss_val = 0.0
loss_val = 0.0
loss_val = 0.0
loss_val = 0.0
Training 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| Time: 0:00:04
 * Status: failure (reached maximum number of iterations)

 * Candidate solution
    Minimizer: [6.36e-01, 9.60e-01, 7.74e-01]
    Minimum:   0.000000e+00

 * Found with
    Algorithm:     ADAM
    Initial Point: [6.36e-01, 9.60e-01, 7.74e-01]

 * Convergence measures
    |x - x'|               = NaN ≰ 0.0e+00
    |x - x'|/|x'|          = NaN ≰ 0.0e+00
    |f(x) - f(x')|         = NaN ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = NaN ≰ 0.0e+00
    |g(x)|                 = NaN ≰ 0.0e+00

 * Work counters
    Seconds run:   5  (vs limit Inf)
    Iterations:    9223372036854775807
    f(x) calls:    9223372036854775807
    ∇f(x) calls:   9223372036854775807

So it looks like it's set to some kind of max value instead of 4.

Setup BlackBoxOptim.jl and Evolutionary.jl with sciml_train

Hook sciml_train into MOI

Example of how to do this: https://github.com/JuliaOpt/MathOptInterface.jl/blob/cdf87e9376a086fb80ea731cece453f5a92878ca/src/Test/nlp.jl#L29

This would give us compatibility with Knitro and IPOPT

Optimization summary is gone?

DiffEqFlux.sciml_train used to print an informative summary at the end, like this

 * Status: failure

 * Candidate solution
    Final objective value:

 * Found with
    Algorithm:     BFGS

 * Convergence measures
...

 * Work counters
    Seconds run:   138  (vs limit Inf)
    Iterations:    78
    f(x) calls:    100
    ∇f(x) calls:   100

but it is gone in the current release. Is it behind an option, or is it completely gone?

(I initially opened this issue in DiffEqFlux but I was told to report it here SciML/DiffEqFlux.jl#513)

Optimizer documentation style

I'm not sure we have the optimizer documentation style down quite yet. It has all of the right elements, but I think it's quite busy. So let's discuss this a bit. Currently we have:

Flux.Optimise.ADAM: ADAM optimizer
- solve(problem, ADAM(η, β::Tuple))
- η is the learning rate
- β::Tuple is the decay of momentums
- defaults to: η = 0.001, β::Tuple = (0.9, 0.999)

My suggestion is to cut this down a bit, i.e.

Flux.Optimise.ADAM(η=0.001, β=(0.9, 0.999)): The ADAM method
- η is the learning rate
- β::Tuple` is the decay of momentums

thoughts?

Can't get NLopt to work

I'm using latest version 0.4.1 and julia v1.6

I'm trying this out for the first time and am having some issues with getting NLopt to work.

using GalacticOptim, Optim
solve = GalacticOptim.solve
rosenbrock(x,p) =  (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2
x0 = zeros(2)
p  = [1.0,100.0]

using NLopt
f = OptimizationFunction(rosenbrock, GalacticOptim.AutoForwardDiff())
prob = OptimizationProblem(f, x0, p)
sol3 = solve(prob, NLopt.LD_LBFGS)

Training 100%|████████████████████████████████████████████████████████████████████████████| Time: 0:00:00
ERROR: MethodError: no method matching apply!(::Algorithm, ::Vector{Float64}, ::Vector{Float64})
Closest candidates are:
  apply!(::ClipNorm, ::Any, ::Any) at /home/fredrikb/.julia/packages/Flux/05b38/src/optimise/optimisers.jl:598
  apply!(::ClipValue, ::Any, ::Any) at /home/fredrikb/.julia/packages/Flux/05b38/src/optimise/optimisers.jl:587
  apply!(::Nesterov, ::Any, ::Any) at /home/fredrikb/.julia/packages/Flux/05b38/src/optimise/optimisers.jl:102
  ...
Stacktrace:
  [1] update!(opt::Algorithm, x::Vector{Float64}, x̄::Vector{Float64})
    @ GalacticOptim ~/.julia/packages/GalacticOptim/gFujJ/src/solve.jl:24
  [2] update!(opt::Algorithm, xs::Zygote.Params, gs::Vector{Float64})
    @ GalacticOptim ~/.julia/packages/GalacticOptim/gFujJ/src/solve.jl:32
  [3] macro expansion
    @ ~/.julia/packages/GalacticOptim/gFujJ/src/solve.jl:108 [inlined]
  [4] macro expansion
    @ ~/.julia/packages/ProgressLogging/BBN0b/src/ProgressLogging.jl:328 [inlined]
  [5] (::GalacticOptim.var"#7#10"{GalacticOptim.var"#8#11", Bool, Bool, OptimizationProblem{true, OptimizationFunction{true, GalacticOptim.AutoForwardDiff{nothing}, typeof(rosenbrock), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Vector{Float64}, Vector{Float64}, Nothing, Nothing, Nothing, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, OptimizationFunction{true, GalacticOptim.AutoForwardDiff, typeof(rosenbrock), GalacticOptim.var"#113#129"{GalacticOptim.var"#111#127"{Vector{Float64}, GalacticOptim.var"#110#126"{OptimizationFunction{true, GalacticOptim.AutoForwardDiff{nothing}, typeof(rosenbrock), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Vector{Float64}}, Int64}, GalacticOptim.var"#110#126"{OptimizationFunction{true, GalacticOptim.AutoForwardDiff{nothing}, typeof(rosenbrock), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Vector{Float64}}}, GalacticOptim.var"#117#133"{GalacticOptim.var"#115#131"{Vector{Float64}, GalacticOptim.var"#110#126"{OptimizationFunction{true, GalacticOptim.AutoForwardDiff{nothing}, typeof(rosenbrock), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Vector{Float64}}, Int64}, GalacticOptim.var"#110#126"{OptimizationFunction{true, GalacticOptim.AutoForwardDiff{nothing}, typeof(rosenbrock), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Vector{Float64}}}, GalacticOptim.var"#119#135", Nothing, Nothing, Nothing}, Zygote.Params, Vector{Float64}})()
    @ GalacticOptim ~/.julia/packages/GalacticOptim/gFujJ/src/solve.jl:57
  [6] with_logstate(f::Function, logstate::Any)
    @ Base.CoreLogging ./logging.jl:491
  [7] with_logger
    @ ./logging.jl:603 [inlined]
  [8] maybe_with_logger(f::GalacticOptim.var"#7#10"{GalacticOptim.var"#8#11", Bool, Bool, OptimizationProblem{true, OptimizationFunction{true, GalacticOptim.AutoForwardDiff{nothing}, typeof(rosenbrock), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Vector{Float64}, Vector{Float64}, Nothing, Nothing, Nothing, Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, OptimizationFunction{true, GalacticOptim.AutoForwardDiff, typeof(rosenbrock), GalacticOptim.var"#113#129"{GalacticOptim.var"#111#127"{Vector{Float64}, GalacticOptim.var"#110#126"{OptimizationFunction{true, GalacticOptim.AutoForwardDiff{nothing}, typeof(rosenbrock), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Vector{Float64}}, Int64}, GalacticOptim.var"#110#126"{OptimizationFunction{true, GalacticOptim.AutoForwardDiff{nothing}, typeof(rosenbrock), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Vector{Float64}}}, GalacticOptim.var"#117#133"{GalacticOptim.var"#115#131"{Vector{Float64}, GalacticOptim.var"#110#126"{OptimizationFunction{true, GalacticOptim.AutoForwardDiff{nothing}, typeof(rosenbrock), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Vector{Float64}}, Int64}, GalacticOptim.var"#110#126"{OptimizationFunction{true, GalacticOptim.AutoForwardDiff{nothing}, typeof(rosenbrock), Nothing, Nothing, Nothing, Nothing, Nothing, Nothing}, Vector{Float64}}}, GalacticOptim.var"#119#135", Nothing, Nothing, Nothing}, Zygote.Params, Vector{Float64}}, logger::LoggingExtras.TeeLogger{Tuple{LoggingExtras.EarlyFilteredLogger{TerminalLoggers.TerminalLogger, GalacticOptim.var"#2#4"}, LoggingExtras.EarlyFilteredLogger{Logging.ConsoleLogger, GalacticOptim.var"#3#5"}}})
    @ GalacticOptim ~/.julia/packages/GalacticOptim/gFujJ/src/solve.jl:35

If I use

prob = OptimizationProblem(rosenbrock, x0, p, lb = [-1.0,-1.0], ub = [1.0,1.0])

I get a different error.

Speed up precompilation

I I try to use this package within mine, precompilation time increases 2 minutes.
Maybe some invalidations, or delaying actual compilation of packages not used? (If I only use Optim and NLsolve, only precompile those two, and avoid the rest).

Documenting call signatures for passed `grad`, `hes`, etc.

trial and error for this quickly became tedious - some pointers would be nice 😄

https://galacticoptim.sciml.ai/stable/API/optimization_function/#optfunction

Cheers!

Add MTK hook

As per the discussion on slack

just do the same thing as modelingtoolkitize
define symbolic variables to match the size of the inputs and parameters, trace them through the function, and generate the OptimizationSystem, then generate the OptimizationProblem from that.

Rosenbrock with non-linear constraints

I adapted the following GalacticOptim test, by making the radius of the constraint smaller:

using GalacticOptim
using Optim

x0 = zeros(2)
rosenbrock(x, p=nothing) =  (1 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
cons= (x,p) -> [x[1]^2 + x[2]^2]
optprob = OptimizationFunction(rosenbrock, GalacticOptim.AutoForwardDiff();cons= cons, num_cons = 1)
prob = OptimizationProblem(optprob, x0, lcons = [-Inf], ucons = [0.25^2])
sol = solve(prob, IPNewton())
sol.minimizer
sqrt(cons(sol.minimizer,nothing)[1])

julia> sqrt(cons(sol.minimizer,nothing)[1])
0.2698675215506136

I adapted the same problem from the Optim documentation

fun(x) =  (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
function fun_grad!(g, x)
g[1] = -2.0 * (1.0 - x[1]) - 400.0 * (x[2] - x[1]^2) * x[1]
g[2] = 200.0 * (x[2] - x[1]^2)
end
function fun_hess!(h, x)
h[1, 1] = 2.0 - 400.0 * x[2] + 1200.0 * x[1]^2
h[1, 2] = -400.0 * x[1]
h[2, 1] = -400.0 * x[1]
h[2, 2] = 200.0
end;
df = TwiceDifferentiable(fun, fun_grad!, fun_hess!, x0)
con_c!(c, x) = (c[1] = x[1]^2 + x[2]^2; c)
function con_jacobian!(J, x)
    J[1,1] = 2*x[1]
    J[1,2] = 2*x[2]
    J
end
function con_h!(h, x, λ)
    h[1,1] += λ[1]*2
    h[2,2] += λ[1]*2
end;
lx = Float64[]; ux = Float64[]
lc = [-Inf]; uc = [0.25^2]
dfc = TwiceDifferentiableConstraints(con_c!, con_jacobian!, con_h!,lx, ux, lc, uc)
res = optimize(df, dfc, x0, IPNewton())
res.minimizer
sqrt(cons(res.minimizer,nothing)[1])

julia> sqrt(cons(res.minimizer,nothing)[1])
0.24999999999999997

Why do results differ?

Consolidate time limit options

Switching between Optim and BBO it is a pain to manually change between time_limit and MaxTime for the solve options. Not everything should be made consistent, but that (and the showtrace/etc.) should.

Could not find cudnn

Trying to compile this package with Julia 1.5.3 on Windows 10 in vscode, I get this error. My CUDA version is 11.2, Driver Version 460.89.
I'm not even sure why it's looking for cuda, since I never specified any desire for using GPU.
Maybe something is wrong with my julia install.

[ Info: Precompiling GalacticOptim [a75be94c-b780-496d-a8a9-0878b188d577]
ERROR: LoadError: InitError: Could not find cudnn (cudnn64_80.dll, cudnn64_8.dll, cudnn_80.dll or cudnn_8.dll) in C:\Users\Andriy\.julia\artifacts\194b57428903281322b7e941d4ba7f744c01e0dd\bin
Stacktrace:
 [1] error(::String) at .\error.jl:33
 [2] artifact_library(::String, ::String, ::VersionNumber) at C:\Users\Andriy\.julia\packages\CUDA\YeS8q\deps\bindeps.jl:103
 [3] use_artifact_cudnn(::VersionNumber) at C:\Users\Andriy\.julia\packages\CUDA\YeS8q\deps\bindeps.jl:272
 [4] use_artifact_cuda() at C:\Users\Andriy\.julia\packages\CUDA\YeS8q\deps\bindeps.jl:186
 [5] __init_dependencies__() at C:\Users\Andriy\.julia\packages\CUDA\YeS8q\deps\bindeps.jl:365
 [6] __runtime_init__() at C:\Users\Andriy\.julia\packages\CUDA\YeS8q\src\initialization.jl:114
 [7] macro expansion at C:\Users\Andriy\.julia\packages\CUDA\YeS8q\src\initialization.jl:32 [inlined]
 [8] macro expansion at .\lock.jl:183 [inlined]
 [9] _functional(::Bool) at C:\Users\Andriy\.julia\packages\CUDA\YeS8q\src\initialization.jl:26
 [10] functional(::Bool) at C:\Users\Andriy\.julia\packages\CUDA\YeS8q\src\initialization.jl:19
 [11] functional at C:\Users\Andriy\.julia\packages\CUDA\YeS8q\src\initialization.jl:18 [inlined]
 [12] __init__() at C:\Users\Andriy\.julia\packages\Flux\q3zeA\src\Flux.jl:54
 [13] _include_from_serialized(::String, ::Array{Any,1}) at .\loading.jl:697
 [14] _require_search_from_serialized(::Base.PkgId, ::String) at .\loading.jl:782
 [15] _require(::Base.PkgId) at .\loading.jl:1007
 [16] require(::Base.PkgId) at .\loading.jl:928
 [17] require(::Module, ::Symbol) at .\loading.jl:923
 [18] include(::Function, ::Module, ::String) at .\Base.jl:380
 [19] include(::Module, ::String) at .\Base.jl:368
 [20] top-level scope at none:2
 [21] eval at .\boot.jl:331 [inlined]
 [22] eval(::Expr) at .\client.jl:467
 [23] top-level scope at .\none:3
during initialization of module Flux
in expression starting at C:\Users\Andriy\.julia\packages\GalacticOptim\8JANI\src\GalacticOptim.jl:7