juliafirstorder / ProximalAlgorithms.jl
Proximal algorithms for nonsmooth optimization in Julia
License: Other
Some of the algorithms' code may be improved (and maybe sped up?) by using the 5-argument version of mul! (requires Julia 1.3), see https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#LinearAlgebra.mul!
For example, the code here could probably use it.
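For reference, mul!(C, A, B, α, β) updates C to A*B*α + C*β in place, avoiding the temporary array that A*B would allocate. A minimal sketch of the difference:
using LinearAlgebra
A = randn(4, 4)
x = randn(4)
y = randn(4)
z1 = y + A * x            # allocating version: A * x creates a temporary
z2 = copy(y)
mul!(z2, A, x, 1.0, 1.0)  # in place: z2 = A*x*1.0 + z2*1.0, no temporary
z1 ≈ z2                   # true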
Some of the methods that are usually implemented for Julia iterators are not yet defined on the algorithm iterator types:
IteratorEltype
eltype
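For what it's worth, a minimal sketch of what such definitions could look like on a hypothetical iterable type (FBS_iterable and FBS_state below are placeholders, not the package's actual internals):
# declare that the element type is known, and what it is
struct FBS_iterable{R,Tx}
    gamma::R
    x0::Tx
end
struct FBS_state{Tx}
    x::Tx
end
Base.IteratorEltype(::Type{<:FBS_iterable}) = Base.HasEltype()
Base.eltype(::Type{FBS_iterable{R,Tx}}) where {R,Tx} = FBS_state{Tx}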
I have a question about the IterationTools module. I would like to apply the sampling method with different frequencies. Let's say I want to display the results of an algorithm with frequency 100 (as you have with the verbose option), and save some variables of the algorithm every 10 iterations (for plotting). My current workaround is to do something like this:
if report iter = tee(iter, fun) end
if verbose iter = tee(sample(iter, freq), disp) end
where fun saves the data if the iteration number is divisible by 10. I am wondering if there is a better way of doing this that I missed?
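For what it's worth, when the display frequency is a multiple of the saving frequency (here 100 = 10 × 10), one alternative sketch is to nest two sample calls instead of checking divisibility inside fun (save! and disp below are hypothetical callbacks, and iter is the algorithm iterable as in the snippet above):
using ProximalAlgorithms.IterationTools: tee, sample
iter = tee(sample(iter, 10), save!)  # save! fires every 10 iterations
iter = tee(sample(iter, 10), disp)   # disp fires every 10 * 10 = 100 iterations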
I am having issues running the following trivial problem in ProximalAlgorithms.jl:
using ProximalAlgorithms, ProximalOperators
# optimization problem:
# minimize || X ||_{*} // nuclear norm
# subject to X >= 0
f = IndPSD() # indicator function for PSD matrices
g = NuclearNorm() # nuclear norm function
x0 = zeros(10,10) # this is also the optimal solution
R = Float64; TOL = 1e-6  # assumed values: R and TOL were not defined in the snippet
solver = ProximalAlgorithms.DouglasRachford{R}(gamma=Float64(1), tol=TOL, verbose=true, freq=1)
y, z, it = solver(x0, f=f, g=g)
where I am getting the following error:
MethodError: no method matching prox!(::Array{Float64,2}, ::IndPSD, ::Array{Float64,2}, ::Float64)
Closest candidates are:
prox!(::AbstractArray{T,N} where N, !Matched::CubeNormL2{R}, ::AbstractArray{T,N} where N, ::R) where {R, T<:Union{Complex{R}, R}} at C:\Users\shuvo\.julia\packages\ProximalOperators\18mAn\src\functions\cubeNormL2.jl:42
prox!(::AbstractArray{R,N} where N, !Matched::ElasticNet{R}, ::AbstractArray{R,N} where N, ::R) where R<:Real at C:\Users\shuvo\.julia\packages\ProximalOperators\18mAn\src\functions\elasticNet.jl:39
prox!(::AbstractArray{T,N} where N, !Matched::HuberLoss, ::AbstractArray{T,N} where N, ::Real) where T<:Union{Real, Complex} at C:\Users\shuvo\.julia\packages\ProximalOperators\18mAn\src\functions\huberLoss.jl:60
...
iterate(::ProximalAlgorithms.DRS_iterable{Float64,Float64,Array{Float64,2},IndPSD,NuclearNorm{Float64}}, ::ProximalAlgorithms.DRS_state{Array{Float64,2}}) at douglasrachford.jl:30
iterate(::ProximalAlgorithms.DRS_iterable{Float64,Float64,Array{Float64,2},IndPSD,NuclearNorm{Float64}}) at douglasrachford.jl:30
iterate(::ProximalAlgorithms.IterationTools.HaltingIterable{ProximalAlgorithms.DRS_iterable{Float64,Float64,Array{Float64,2},IndPSD,NuclearNorm{Float64}},ProximalAlgorithms.var"#stop#25"{ProximalAlgorithms.DouglasRachford{Float64}}}) at iterationtools.jl:20
iterate at iterators.jl:587 [inlined]
iterate at iterators.jl:585 [inlined]
iterate at iterators.jl:139 [inlined]
iterate at iterators.jl:138 [inlined]
iterate at iterationtools.jl:79 [inlined]
iterate at iterationtools.jl:79 [inlined]
iterate at iterationtools.jl:52 [inlined]
loop at iterationtools.jl:126 [inlined]
#_#24(::IndPSD, ::NuclearNorm{Float64}, ::ProximalAlgorithms.DouglasRachford{Float64}, ::Array{Float64,2}) at douglasrachford.jl:70
(::Core.var"#kw#Any")(::NamedTuple{(:f, :g),Tuple{IndPSD,NuclearNorm{Float64}}}, ::ProximalAlgorithms.DouglasRachford{Float64}, ::Array{Float64,2}) at none:0
top-level scope at testing_proximal_algorithms.jl:58
Please let me know how I can resolve the issue.
Hi Lorenzo,
I was trying to open a pull request for an update on the stepsizes for the primal-dual solver, but I forgot to change my remote and ended up committing to the main ProximalAlgorithms repo. I subsequently reverted to your last commit. Sorry for that.
If you don't want questions in your issues, please just delete this.
We have complicated constraints that cannot (or at least it is not obvious how to) be enforced via a proximal operator. However, with Optim.jl we've had good experiences just returning Inf whenever the constraints are not met (and there are some theoretical reasons why this should work, as well as a lot of practical experience). Using ProximalAlgorithms, this strategy leads to convergence problems. Do you know if ProximalAlgorithms handles Inf values differently than Optim? Is there any advantage of returning floatmax(...) or a very high number instead of Inf?
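For concreteness, the strategy described above looks roughly like this (base_loss and is_feasible are hypothetical placeholders, not names from either package):
# return Inf outside the feasible set, the actual objective inside it
penalized_loss(x) = is_feasible(x) ? base_loss(x) : Inf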
Writing ProximalAlgorithms.ForwardBackward() is quite long. Is there a particular reason why the constructors are not exported?
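In the meantime, two standard workarounds avoid spelling out the full name:
import ProximalAlgorithms
using ProximalAlgorithms: ForwardBackward  # explicit import works even for unexported names
const PA = ProximalAlgorithms              # or use a short module alias
solver = ForwardBackward()
solver = PA.ForwardBackward()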
A list of relevant changes is here. It will be important to update the iterators structure according to the new protocol (cf. the old protocol).
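For reference, Julia ≥ 0.7 uses a single iterate function that returns nothing when done and an (element, state) tuple otherwise, replacing the old start/next/done triple. A toy iterator under the new protocol (CountTo is an illustration, unrelated to the package):
struct CountTo
    n::Int
end
# one method covers both the first call (default state) and later calls
Base.iterate(c::CountTo, i=1) = i > c.n ? nothing : (i, i + 1)
Base.length(c::CountTo) = c.n
collect(CountTo(3))  # [1, 2, 3]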
A PkgEval run for a PR that changes the numbers generated by randn! indicates that the tests of this package might fail on Julia 1.5 (and on the current Julia master). Apologies if this is a false positive.
cf. https://github.com/JuliaCI/NanosoldierReports/blob/7de24e455342298cbef56826b5827f0d7640d2c1/pkgeval/by_hash/b89e35c_vs_098ef24/logs/ProximalAlgorithms/1.5.0-DEV-71a4a114c2.log
Hi there!
I was wondering if you'd be interested in a PR that abstracts away the Zygote dependency by adopting the formalism from AbstractDifferentiation?
Right now the package falls back to Zygote for computing gradients via automatic differentiation. It would be nice to add support for multiple AD backends (ReverseDiff, Enzyme, Yota, Nabla, Diffractor… anything that works) so that one can choose which should be used.
This would require a mechanism for “optional dependencies”, so that a certain backend can be used only if the required dependency is installed.
Two ways I see for selecting which backend to use: ZygoteFunction(f), ReverseDiffFunction(f), … should be easy to implement?
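A hypothetical sketch of the wrapper-type approach (ZygoteFunction and gradient! here are illustrative, not an actual API of either package):
using Zygote
struct ZygoteFunction{F}
    f::F
end
# the wrapper evaluates like the wrapped function
(w::ZygoteFunction)(x) = w.f(x)
# compute the value and write the gradient into `grad` via Zygote's pullback
function gradient!(grad, w::ZygoteFunction, x)
    y, back = Zygote.pullback(w.f, x)
    grad .= back(one(y))[1]
    return y
end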
For example, in PANOC:
if warnings
@warn "parameter `gamma` became too small ($(state.gamma)), stopping the iterations"
end
I would also default it to warnings=false.
Currently REQUIRE has RecursiveArrayTools and AbstractOperators as dependencies. These, however, can be removed: utilities/tomove.jl contains the methods which extend prox, gradient and their in-place versions to ArrayPartitions. If we move these to ProximalOperators, then this dependency won't be necessary.
It looks like you may be using norm(matrix). In Julia 0.7, this will compute the Frobenius norm (vecnorm in Julia 0.6), due to JuliaLang/julia#27401. If you want the induced/operator norm as in Julia 0.6, use opnorm(matrix) instead, or Compat.opnorm(matrix) to work in 0.6 and 0.7 (JuliaLang/Compat.jl#577).
Note that, for testing purposes, rather than @test norm(A - B) ≤ tol, it is usually preferred to do @test A ≈ B or @test A ≈ B rtol=... (which uses isapprox).
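To illustrate the difference on a concrete matrix:
using LinearAlgebra, Test
A = [3.0 0.0; 0.0 4.0]
norm(A)    # 5.0: Frobenius norm (what vecnorm gave in Julia 0.6)
opnorm(A)  # 4.0: induced/operator 2-norm (what norm gave in Julia 0.6)
@test A ≈ A .+ 1e-12  # isapprox-based test, preferred over @test norm(A - B) ≤ tol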
Some things that could be improved about verbose mode:
- it currently uses the @printf macro for printing, but one could use logging facilities instead (see the sketch below), cf. https://docs.julialang.org/en/v1/stdlib/Logging/#Logging
- a correct implementation is in PANOC.jl
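A sketch of what logging-based progress output could look like (the loop and the reported quantities are illustrative placeholders, not the package's internals):
using Logging
for it in 1:300
    gamma = 1.0 / it
    # logged at Info level; users can filter or redirect via the active logger
    it % 100 == 0 && @info "iteration" it gamma
end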
I managed to get most algorithms running on the Sparse regression example, but not the DRLS algorithm. I guess it should be applicable to sparse regression, but what needs to be done with:
drls = ProximalAlgorithms.DRLS(tol=1e-4, verbose=true)
solution_dr, iterations_dr = drls(x0=zeros(n_features+1), f=training_loss, g=reg, gamma=1e-7)
Related to #77
This issue is used to trigger TagBot; feel free to unsubscribe.
If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.
If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!
Hello,
Thank you for developing and maintaining such a useful package :)
I'm trying to figure out the best way to deal with multidimensional arrays. I've touched this subject before in kul-optec/AbstractOperators.jl#13, which helped but my method is still sub-optimal. I've tried to make a MWE to illustrate my workflow:
using ImageFiltering, FFTW, AbstractOperators, ProximalAlgorithms, ProximalOperators
n = 25
p1 = Kernel.gaussian(3)
p2 = Kernel.gaussian(6)
xtrue = zeros(n,n,2)
xtrue[13,13,:] .= 1
b1 = imfilter(xtrue[:,:,1], p1)
b2 = imfilter(xtrue[:,:,2], p2)
b = zeros(n,n,2)
b[:,:,1] = b1
b[:,:,2] = b2
sp = round.(Int, (size(p2) .- size(p1)) ./ 2)
p1_pad = parent(padarray(p1, Fill(0,sp,sp)))
kern = zeros(n,n,2)
kern[:,:,1] = p1_pad
kern[:,:,2] = p2
center = round.(Int,size(p2)./2).+1
Fps = fft(circshift(kern,center),(1,2))
ifft_F = IDFT(Complex{Float64}, (n,n))
fft_F = DFT((n,n))
solver = ProximalAlgorithms.ForwardBackward(tol=1e-12, maxit=20000, fast=true, adaptive=true)
g = IndNonnegative()
bc = complex(b)
xp1 = similar(kern)
for i = 1:size(kern,3)
f = Translate(SqrNormL2(), -bc[:,:,i])
psf = DiagOp(Fps[:,:,i])
A = Compose(ifft_F,Compose(psf,fft_F))
xp1[:,:,i],_ = solver(zeros(n,n), f=f, A=A, g=g)
end
all(isapprox.(xp1[13,13,:],1.0;rtol=1e-4)) # true
##### 3D CASE: #######
ifft_F = IDFT(Complex{Float64}, (n,n,2),(1,2))
fft_F = DFT((n,n,2),(1,2))
f = Translate(SqrNormL2(), -bc)
psf = DiagOp(Fps)
A = Compose(ifft_F,Compose(psf,fft_F))
xp2 = similar(kern)
xp2, _ = solver(zeros(n,n,2), f=f, A=A, g=g)
all(isapprox.(xp2[13,13,:],1.0;rtol=1e-4)) # false
Basically, I'm doing image deblurring in acoustic imaging, where I have many 2D 'images' for different frequency bins, so in my setup I'm doing a lot of processing on 3D arrays. In the above example I therefore need to loop over the 3rd dimension of kern/psf and b, and construct f and A in each loop.
I see two optimization strategies here:
1. Implement broadcasting for operators such as Translate, DiagOp and Compose, to allow in-place operations (.=).
2. Modify the solvers to accept 3D arrays (they actually already do, see the MWE) and calculate gamma from a slice along a certain dimension (this is currently the issue I'm facing).
I might have misunderstood the solver interface, so forgive me if I'm mistaken. I would appreciate any comments on how to improve the performance.
Thanks!
/Oliver
Currently, the step size gamma parameter can only decrease. I tried letting it increase as well, e.g. adding
state.gamma *= 1.02
in the iteration loop, and I seem to get ~2x faster convergence on some problems. Does this modification make sense in general? For some algorithms? Should it be included in the package implementation itself?
I was looking at the Davis and Yin (2017) paper and their description of Algorithm 2. I'm a bit confused about how their notation corresponds to the one in the manual: they also use f(x) + g(x) + h(x), but they take the gradient of h(x), whereas in the documentation the gradient is taken of f(x). Can you clarify the relation between the documentation and Algorithm 2?
I get an error when I try to run the Sparse Regression example in the documentation:
UndefVarError: FastForwardBackward not defined
getproperty(x::Module, f::Symbol) at Base.jl:35
top-level scope at ProxAlgLASSO.jl:47
eval at boot.jl:373 [inlined]
In order to get type-stable iterators, we should specify the types of the fields of these objects. For example, the FBSIterator would become:
mutable struct FBSIterator{I <: Integer, R <: Real,
D <: BlockArray, C <: BlockArray,
FS, AS, FQ, AQ, G} <: ProximalAlgorithm{I, D}
x::D
fs::FS
As::AS
fq::FQ
Aq::AQ
g::G
gamma::R
maxit::I
tol::R
...
Notice that I also added C <: BlockArray: this is because preallocated outputs coming from e.g. Aq*x can have a different type with respect to x (for example, they might be complex).
This gives a problem, since the constructor of FBSIterator currently has Aq=Identity(blocksize(x0)) in its keyword arguments.
We should find a way to make sure that Aq and As map to the same codomain / type.
I think we should abandon Identity in favour of the Zeros operator of AbstractOperators.
The default keyword argument for Aq and As would become:
Aq = Zeros(blockeltype(x0), blocksize(x0), blocksize(x0))
Then we would need a utility function inside the constructors of the iterators that takes Aq and As as inputs and ensures that they have the same codomain and domain. Moreover, if either one of Aq or As is of type Zeros, the function substitutes it with a new Zeros object whose codomain is consistent with the operator that is not; a sketch of such a utility follows.
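A hypothetical sketch of such a utility (the Zeros constructor signature and the codomainType / size conventions are assumed to follow AbstractOperators, and may differ in detail):
using AbstractOperators: Zeros, codomainType  # assumed imports
function match_codomains(Aq, As)
    if Aq isa Zeros && !(As isa Zeros)
        # replace the Zeros so its codomain matches As (assumed constructor)
        Aq = Zeros(codomainType(As), size(As, 2), size(As, 1))
    elseif As isa Zeros && !(Aq isa Zeros)
        As = Zeros(codomainType(Aq), size(Aq, 2), size(Aq, 1))
    end
    size(Aq, 1) == size(As, 1) || error("Aq and As must map to the same codomain")
    return Aq, As
end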
Alternatively, we could allow for different codomains/types of Aq and As, with Cq and Cs.