juliafirstorder / ProximalAlgorithms.jl
Proximal algorithms for nonsmooth optimization in Julia
License: Other
Some of the algorithms' code may be improved (and maybe sped up?) by using the 5-argument version of mul! (requires Julia 1.3), see https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#LinearAlgebra.mul!
For example, the code here could probably use it.
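For reference, mul!(C, A, B, α, β) updates C to A*B*α + C*β in place, avoiding the temporary array that A*B would allocate. A minimal sketch of the difference:
using LinearAlgebra
A = randn(4, 4)
x = randn(4)
y = randn(4)
z1 = y + A * x            # allocating version: A * x creates a temporary
z2 = copy(y)
mul!(z2, A, x, 1.0, 1.0)  # in place: z2 = A*x*1.0 + z2*1.0, no temporary
z1 ≈ z2                   # true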
Some of the methods that are usually implemented for Julia iterators are not yet defined on the algorithm iterator types:
IteratorEltype
eltype
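For what it's worth, a minimal sketch of what such definitions could look like on a hypothetical iterable type (FBS_iterable and FBS_state below are placeholders, not the package's actual internals):
# declare that the element type is known, and what it is
struct FBS_iterable{R,Tx}
    gamma::R
    x0::Tx
end
struct FBS_state{Tx}
    x::Tx
end
Base.IteratorEltype(::Type{<:FBS_iterable}) = Base.HasEltype()
Base.eltype(::Type{FBS_iterable{R,Tx}}) where {R,Tx} = FBS_state{Tx}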
I have a question about the IterationTools module. I would like to apply the sampling method with different frequencies. Let's say I want to display the results of an algorithm with frequency 100 (as you have with the verbose option), and save some variables of the algorithm every 10 iterations (for plotting). My current workaround is to do something like this:
if report iter = tee(iter, fun) end
if verbose iter = tee(sample(iter, freq), disp) end
where fun saves the data if the iteration number is divisible by 10. I am wondering if there is a better way of doing this that I missed?
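For what it's worth, when the display frequency is a multiple of the saving frequency (here 100 = 10 × 10), one alternative sketch is to nest two sample calls instead of checking divisibility inside fun (save! and disp below are hypothetical callbacks, and iter is the algorithm iterable as in the snippet above):
using ProximalAlgorithms.IterationTools: tee, sample
iter = tee(sample(iter, 10), save!)  # save! fires every 10 iterations
iter = tee(sample(iter, 10), disp)   # disp fires every 10 * 10 = 100 iterations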
I am having issues running the following trivial problem in ProximalAlgorithms.jl:
using ProximalAlgorithms, ProximalOperators
# optimization problem:
# minimize || X ||_{*} // nuclear norm
# subject to X >= 0
f = IndPSD() # indicator function for PSD matrices
g = NuclearNorm() # nuclear norm function
x0 = zeros(10,10) # this is also the optimal solution
R = Float64; TOL = 1e-6  # assumed values: R and TOL were not defined in the snippet
solver = ProximalAlgorithms.DouglasRachford{R}(gamma=Float64(1), tol=TOL, verbose=true, freq=1)
y, z, it = solver(x0, f=f, g=g)
where I am getting the following error:
MethodError: no method matching prox!(::Array{Float64,2}, ::IndPSD, ::Array{Float64,2}, ::Float64)
Closest candidates are:
prox!(::AbstractArray{T,N} where N, !Matched::CubeNormL2{R}, ::AbstractArray{T,N} where N, ::R) where {R, T<:Union{Complex{R}, R}} at C:\Users\shuvo\.julia\packages\ProximalOperators\18mAn\src\functions\cubeNormL2.jl:42
prox!(::AbstractArray{R,N} where N, !Matched::ElasticNet{R}, ::AbstractArray{R,N} where N, ::R) where R<:Real at C:\Users\shuvo\.julia\packages\ProximalOperators\18mAn\src\functions\elasticNet.jl:39
prox!(::AbstractArray{T,N} where N, !Matched::HuberLoss, ::AbstractArray{T,N} where N, ::Real) where T<:Union{Real, Complex} at C:\Users\shuvo\.julia\packages\ProximalOperators\18mAn\src\functions\huberLoss.jl:60
...
iterate(::ProximalAlgorithms.DRS_iterable{Float64,Float64,Array{Float64,2},IndPSD,NuclearNorm{Float64}}, ::ProximalAlgorithms.DRS_state{Array{Float64,2}}) at douglasrachford.jl:30
iterate(::ProximalAlgorithms.DRS_iterable{Float64,Float64,Array{Float64,2},IndPSD,NuclearNorm{Float64}}) at douglasrachford.jl:30
iterate(::ProximalAlgorithms.IterationTools.HaltingIterable{ProximalAlgorithms.DRS_iterable{Float64,Float64,Array{Float64,2},IndPSD,NuclearNorm{Float64}},ProximalAlgorithms.var"#stop#25"{ProximalAlgorithms.DouglasRachford{Float64}}}) at iterationtools.jl:20
iterate at iterators.jl:587 [inlined]
iterate at iterators.jl:585 [inlined]
iterate at iterators.jl:139 [inlined]
iterate at iterators.jl:138 [inlined]
iterate at iterationtools.jl:79 [inlined]
iterate at iterationtools.jl:79 [inlined]
iterate at iterationtools.jl:52 [inlined]
loop at iterationtools.jl:126 [inlined]
#_#24(::IndPSD, ::NuclearNorm{Float64}, ::ProximalAlgorithms.DouglasRachford{Float64}, ::Array{Float64,2}) at douglasrachford.jl:70
(::Core.var"#kw#Any")(::NamedTuple{(:f, :g),Tuple{IndPSD,NuclearNorm{Float64}}}, ::ProximalAlgorithms.DouglasRachford{Float64}, ::Array{Float64,2}) at none:0
top-level scope at testing_proximal_algorithms.jl:58
Please let me know how I can resolve the issue.
Hi Lorenzo,
I was trying to open a pull request for an update on the stepsizes for the primal-dual solver, but I forgot to change my remote and ended up committing to the main ProximalAlgorithms repo. I subsequently reverted to your last commit. Sorry for that.
If you don't want questions in your issues, please just delete this.
We have complicated constraints that cannot (or at least it is not obvious how to) be enforced via a proximal operator. However, with Optim.jl we've had good experiences just returning Inf whenever the constraints are not met (and there are some theoretical reasons why this should work, as well as a lot of practical experience). Using ProximalAlgorithms, this strategy leads to convergence problems. Do you know if ProximalAlgorithms handles Inf values differently than Optim? Is there any advantage of returning floatmax(...) or a very high number instead of Inf?
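For concreteness, the strategy described above looks roughly like this (base_loss and is_feasible are hypothetical placeholders, not names from either package):
# return Inf outside the feasible set, the actual objective inside it
penalized_loss(x) = is_feasible(x) ? base_loss(x) : Inf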
Writing ProximalAlgorithms.ForwardBackward() is quite long. Is there a particular reason why the constructors are not exported?
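In the meantime, two standard workarounds avoid spelling out the full name:
import ProximalAlgorithms
using ProximalAlgorithms: ForwardBackward  # explicit import works even for unexported names
const PA = ProximalAlgorithms              # or use a short module alias
solver = ForwardBackward()
solver = PA.ForwardBackward()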
A list of relevant changes is here. It will be important to update the iterators structure according to the new protocol (cf. the old protocol).
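For reference, Julia ≥ 0.7 uses a single iterate function that returns nothing when done and an (element, state) tuple otherwise, replacing the old start/next/done triple. A toy iterator under the new protocol (CountTo is an illustration, unrelated to the package):
struct CountTo
    n::Int
end
# one method covers both the first call (default state) and later calls
Base.iterate(c::CountTo, i=1) = i > c.n ? nothing : (i, i + 1)
Base.length(c::CountTo) = c.n
collect(CountTo(3))  # [1, 2, 3]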
A PkgEval run for a PR that changes the numbers generated by randn! indicates that the tests of this package might fail on Julia 1.5 (and on the current Julia master). Apologies if this is a false positive.
cf. https://github.com/JuliaCI/NanosoldierReports/blob/7de24e455342298cbef56826b5827f0d7640d2c1/pkgeval/by_hash/b89e35c_vs_098ef24/logs/ProximalAlgorithms/1.5.0-DEV-71a4a114c2.log
Hi there!
I was wondering if you'd be interested in a PR that abstracts away the Zygote dependency by adopting the formalism from AbstractDifferentiation?
Right now the package falls back to Zygote for computing gradients via automatic differentiation. It would be nice to add support for multiple AD backends (ReverseDiff, Enzyme, Yota, Nabla, Diffractor… anything that works) so that one can choose which should be used.
This would require a mechanism for “optional dependencies”, so that a certain backend can be used only if the required dependency is installed.
Two ways I see for selecting which backend to use: ZygoteFunction(f), ReverseDiffFunction(f), … should be easy to implement?
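A hypothetical sketch of the wrapper-type approach (ZygoteFunction and gradient! here are illustrative, not an actual API of either package):
using Zygote
struct ZygoteFunction{F}
    f::F
end
# the wrapper evaluates like the wrapped function
(w::ZygoteFunction)(x) = w.f(x)
# compute the value and write the gradient into `grad` via Zygote's pullback
function gradient!(grad, w::ZygoteFunction, x)
    y, back = Zygote.pullback(w.f, x)
    grad .= back(one(y))[1]
    return y
end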
For example, in PANOC:
if warnings
@warn "parameter `gamma` became too small ($(state.gamma)), stopping the iterations"
end
I would also default it to warnings=false.
Currently REQUIRE has RecursiveArrayTools and AbstractOperators as dependencies. These, however, can be removed: utilities/tomove.jl contains the methods which extend prox, gradient and their in-place versions to ArrayPartitions. If we move these to ProximalOperators, then this dependency won't be necessary.
It looks like you may be using norm(matrix). In Julia 0.7, this will compute the Frobenius norm (vecnorm in Julia 0.6), due to JuliaLang/julia#27401. If you want the induced/operator norm as in Julia 0.6, use opnorm(matrix) instead, or Compat.opnorm(matrix) to work in 0.6 and 0.7 (JuliaLang/Compat.jl#577).
Note that, for testing purposes, rather than @test norm(A - B) ≤ tol, it is usually preferred to do @test A ≈ B or @test A ≈ B rtol=... (which uses isapprox).
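To illustrate the difference on a concrete matrix:
using LinearAlgebra, Test
A = [3.0 0.0; 0.0 4.0]
norm(A)    # 5.0: Frobenius norm (what vecnorm gave in Julia 0.6)
opnorm(A)  # 4.0: induced/operator 2-norm (what norm gave in Julia 0.6)
@test A ≈ A .+ 1e-12  # isapprox-based test, preferred over @test norm(A - B) ≤ tol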
Some things that could be improved about verbose mode:
- it currently uses the @printf macro for printing, but one could use logging facilities instead (see the sketch below), cf. https://docs.julialang.org/en/v1/stdlib/Logging/#Logging
- a correct implementation is in PANOC.jl
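A sketch of what logging-based progress output could look like (the loop and the reported quantities are illustrative placeholders, not the package's internals):
using Logging
for it in 1:300
    gamma = 1.0 / it
    # logged at Info level; users can filter or redirect via the active logger
    it % 100 == 0 && @info "iteration" it gamma
end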
I managed to get most algorithms running on the Sparse regression example, but not the DRLS algorithm. I guess it should be applicable to sparse regression, but what needs to be done with:
drls = ProximalAlgorithms.DRLS(tol=1e-4, verbose=true)
solution_dr, iterations_dr = drls(x0=zeros(n_features+1), f=training_loss, g=reg, gamma=1e-7)
Related to #77
This issue is used to trigger TagBot; feel free to unsubscribe.
If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.
If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!
Hello,
Thank you for developing and maintaining such a useful package :)
I'm trying to figure out the best way to deal with multidimensional arrays. I've touched this subject before in kul-optec/AbstractOperators.jl#13, which helped but my method is still sub-optimal. I've tried to make a MWE to illustrate my workflow:
using ImageFiltering, FFTW, AbstractOperators, ProximalAlgorithms, ProximalOperators
n = 25
p1 = Kernel.gaussian(3)
p2 = Kernel.gaussian(6)
xtrue = zeros(n,n,2)
xtrue[13,13,:] .= 1
b1 = imfilter(xtrue[:,:,1], p1)
b2 = imfilter(xtrue[:,:,2], p2)
b = zeros(n,n,2)
b[:,:,1] = b1
b[:,:,2] = b2
sp = round.(Int, (size(p2) .- size(p1)) ./ 2)
p1_pad = parent(padarray(p1, Fill(0,sp,sp)))
kern = zeros(n,n,2)
kern[:,:,1] = p1_pad
kern[:,:,2] = p2
center = round.(Int,size(p2)./2).+1
Fps = fft(circshift(kern,center),(1,2))
ifft_F = IDFT(Complex{Float64}, (n,n))
fft_F = DFT((n,n))
solver = ProximalAlgorithms.ForwardBackward(tol=1e-12, maxit=20000, fast=true, adaptive=true)
g = IndNonnegative()
bc = complex(b)
xp1 = similar(kern)
for i = 1:size(kern,3)
f = Translate(SqrNormL2(), -bc[:,:,i])
psf = DiagOp(Fps[:,:,i])
A = Compose(ifft_F,Compose(psf,fft_F))
xp1[:,:,i],_ = solver(zeros(n,n), f=f, A=A, g=g)
end
all(isapprox.(xp1[13,13,:],1.0;rtol=1e-4)) # true
##### 3D CASE: #######
ifft_F = IDFT(Complex{Float64}, (n,n,2),(1,2))
fft_F = DFT((n,n,2),(1,2))
f = Translate(SqrNormL2(), -bc)
psf = DiagOp(Fps)
A = Compose(ifft_F,Compose(psf,fft_F))
xp2 = similar(kern)
xp2, _ = solver(zeros(n,n,2), f=f, A=A, g=g)
all(isapprox.(xp2[13,13,:],1.0;rtol=1e-4)) # false
Basically, I'm doing image deblurring in acoustic imaging, where I have many 2D 'images' for different frequency bins, so in my setup I'm doing a lot of processing on 3D arrays. In the above example I therefore need to loop over the 3rd dimension of kern/psf and b, and construct f and A in each loop.
I see two optimization strategies here:
1. Implement broadcasting for operators such as Translate, DiagOp and Compose, to allow in-place operations (.=).
2. Modify the solvers to accept 3D arrays (they actually already do, see the MWE) and calculate gamma from a slice along a certain dimension (this is currently the issue I'm facing).
I might have misunderstood the solver interface, so forgive me if I'm mistaken. I would appreciate any comments on how to improve the performance.
Thanks!
/Oliver
Currently, the step size gamma parameter can only decrease. I tried letting it increase as well, e.g. adding
state.gamma *= 1.02
in the iteration loop, and I seem to get ~2x faster convergence on some problems. Does this modification make sense in general? For some algorithms? Should it be included in the package implementation itself?
I was looking at the Davis and Yin (2017) paper and their description of Algorithm 2. I'm a bit confused about how their notation corresponds to the one in the manual: they also use f(x) + g(x) + h(x), but they take the gradient of h(x), whereas in the documentation the gradient is taken of f(x). Can you clarify the relation between the documentation and Algorithm 2?
I get an error when I try to run the Sparse Regression example in the documentation:
UndefVarError: FastForwardBackward not defined
getproperty(x::Module, f::Symbol) at Base.jl:35
top-level scope at ProxAlgLASSO.jl:47
eval at boot.jl:373 [inlined]
In order to get type-stable iterators, we should specify the types of the fields of these objects. For example, the FBSIterator would become:
mutable struct FBSIterator{I <: Integer, R <: Real,
D <: BlockArray, C <: BlockArray,
FS, AS, FQ, AQ, G} <: ProximalAlgorithm{I, D}
x::D
fs::FS
As::AS
fq::FQ
Aq::AQ
g::G
gamma::R
maxit::I
tol::R
...
Notice that I also added C <: BlockArray: this is because preallocated outputs coming from e.g. Aq*x can have a different type with respect to x (for example, they might be complex).
This gives a problem, since the constructor of FBSIterator currently has Aq=Identity(blocksize(x0)) in its keyword arguments.
We should find a way to make sure that Aq and As map to the same codomain / type.
I think we should abandon Identity in favour of the Zeros operator of AbstractOperators.
The default keyword argument for Aq and As would become:
Aq = Zeros(blockeltype(x0), blocksize(x0), blocksize(x0))
Then we would need a utility function inside the constructors of the iterators that takes Aq and As as inputs and ensures that they have the same codomain and domain. Moreover, if either one of Aq or As is of type Zeros, the function substitutes it with a new Zeros object whose codomain is consistent with the operator that is not; a sketch of such a utility follows.
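A hypothetical sketch of such a utility (the Zeros constructor signature and the codomainType / size conventions are assumed to follow AbstractOperators, and may differ in detail):
using AbstractOperators: Zeros, codomainType  # assumed imports
function match_codomains(Aq, As)
    if Aq isa Zeros && !(As isa Zeros)
        # replace the Zeros so its codomain matches As (assumed constructor)
        Aq = Zeros(codomainType(As), size(As, 2), size(As, 1))
    elseif As isa Zeros && !(Aq isa Zeros)
        As = Zeros(codomainType(Aq), size(Aq, 2), size(Aq, 1))
    end
    size(Aq, 1) == size(As, 1) || error("Aq and As must map to the same codomain")
    return Aq, As
end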
Alternatively, we could allow for different codomains/types of Aq and As, with Cq and Cs.