The BFGS solver fails to solve the Rosenbrock problem for x0 = (-1.2, 1): it returns (

Bfgs terminates at nan about cppnumericalsolvers HOT 6 CLOSED

patwie commented on June 7, 2024

Bfgs terminates at nan

from cppnumericalsolvers.

Comments (6)

PatWie commented on June 7, 2024

Good catch! Thanks! I modified the linesearch by using the hessian for a better approximation. The unit-test starts now in "-1.2,1;". It works 👍

see 804ebf7

from cppnumericalsolvers.

tosttost commented on June 7, 2024

Nice work.

Did you use any literature for that? Your fix destroys the descend property.
For the starting point -1.2, 100 there is an iteration in which the linesearch terminates with f = 110.552, but f_in is only 92.0189.

A few iterations later H is
-0.0470248 0.754048
0.754048 -12.0495
This Matrix is clearly not positive definite and the algorithm terminates with NaN, NaN.

PS a good description of a linesearch algorithm can be found in Rodger Fletcher, Practical Methods of Optimization, Second Edition (the first edition is not very useful). This linesearch is used in the gnu scientific library. I meanwhile wrote my own BFGS solver based on that and atlas - my app is in D and it avoids interfacing to C++...

PPS: H is always symmetric. Your code seems not to use that fact - may Eigens rankUpdate(*) functions for Symmetric/selfadjoint views would improve the performance.

from cppnumericalsolvers.

PatWie commented on June 7, 2024

My intension was to use a better approximation (2nd order instead of 1st order) for the line search. Now, I found

"THE PERFORMANCE OF A MODIFIED ARMIJO LINE SEARCH RULE IN BFGS OPTIMIZATION METHOD"
(http://www.ccms.or.kr/data/pdfpaper/jcms21_1/21_1_117.pdf)

See "Algorithm 2. Modifed Armijo line search rule". They used a similiar idea.

Can you upload your line search method (in Dlang) to Github-Gist, so I can look into it?
PS: Feel free to make a pull request or a fork.

from cppnumericalsolvers.

tesch1 commented on June 7, 2024

I have a frankensteinish fork of this repo with the line search from lbfgs.c patched in (https://github.com/chokkan/liblbfgs) .. it works but it's .. let's say, not the prettiest thing in the world. I'm not really sure if the same conditions are appropriate for L-BFGS-B linesearch, it seems to stall out sometimes unable to find a suitable interval. Also appears not to work for newton, haven't looked at why.
In addition, there are wrappers for Ipopt and Hager's ASA_CG (http://users.clas.ufl.edu/hager/papers/Software/)
Also- the solvers are templated - one of the aims was auto differentiation using dual/hyperduals, or Eigen's autodiff, as of yet unfinished.

from cppnumericalsolvers.

tosttost commented on June 7, 2024

"My intension was to use a better approximation (2nd order instead of 1st order) for the line search."
As far as I see neither armijo rule nor your fix does not use any approximation of the function in the search direction to find the stepwidth. You only modified the acceptance criterion based on H. Note that H in BFGS is not the hessian but converges against the inverse of the hessian. The acceptance criterion uses the slope of f in the search direction (and with the fix, H) to ensure sufficient reduction of the function value. This alone is not sufficient for BFGS.
For BFGS the (strong) Wolfe conditions work well:

f(x + t_d) <= f(x) + t_constparam_f'(x) // satisfied with the armijo rule
|f'(x + t_d)| <= -sigma_f'(x) // not satisfied with the armijo rule
with
sigma < 1 and f'() = d.transpose_grad_f()

I do not know about lbfgs.c linesearch but a the wolfe conditions are mention in the docs, so it probably works.

"THE PERFORMANCE OF A MODIFIED ARMIJO LINE SEARCH RULE IN BFGS OPTIMIZATION METHOD"
I am not sure that is a good paper. I take a quick look at reference [9] where the new linesearch is presented. BFGS is not mentioned at all and the BFGS would have to work with B=H.inverse() in order to have an approximation of the hessian. This requires to solve a set of linear linear equations (or to invert a matrix) in order to calculate the search direction (and use the BFGS update formula for B).

My implementation of Flechter's linesearch needs a lot of refactoring and it still evaluates the function more often than reported by Fletcher. The algorithm is much longer than the Armijo rule. At the moment it exists only on my disk and needs some work before it can be published on github. I have a lot of other work, so it could take some time.

May the GSL implementation (written in C) helps - it is based on the same description:

http://www.gnu.org/software/gsl/
Download version 1.16 sourcecode
the linesearch is the "minimize" function in gsl-1.16/multimin/linear_minimize.c

Pull request / fork: not a real option, because for my purpose it is best to have no C++ code (in special, no C++ templates) at all. Additionally my knowledge of C++ is quite limited - sorry!

from cppnumericalsolvers.

PatWie commented on June 7, 2024

It seems to be the best way to implement the line search algorithm by Jorge J. More' and David J. Thuente
"Line Search Algorithms with Guaranteed Sufficient Decrease"

@tesch1:
I saw your code. My problem with a templated solvers is a Matlab-binding. It is inappropriate to recompile the source code for each problem statement in Matlab. Anyway, please drop me an email (base64: cGF0cmlja0B3aWVzY2hvbGxlay5pbmZv), because I am working on a nice, slightly different idea for this project. Maybe, a templated version works for me, too.

In general it would be great to continue this discussion here:
https://groups.google.com/forum/?hl=de#!forum/cppnumericalsolver

from cppnumericalsolvers.

Bfgs terminates at nan about cppnumericalsolvers HOT 6 CLOSED

Comments (6)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent