Giter Club home page Giter Club logo

Comments (6)

PatWie avatar PatWie commented on June 7, 2024

Good catch! Thanks! I modified the linesearch by using the hessian for a better approximation. The unit-test starts now in "-1.2,1;". It works 👍

see 804ebf7

from cppnumericalsolvers.

tosttost avatar tosttost commented on June 7, 2024

Nice work.

Did you use any literature for that? Your fix destroys the descend property.
For the starting point -1.2, 100 there is an iteration in which the linesearch terminates with f = 110.552, but f_in is only 92.0189.

A few iterations later H is
-0.0470248 0.754048
0.754048 -12.0495
This Matrix is clearly not positive definite and the algorithm terminates with NaN, NaN.

PS a good description of a linesearch algorithm can be found in Rodger Fletcher, Practical Methods of Optimization, Second Edition (the first edition is not very useful). This linesearch is used in the gnu scientific library. I meanwhile wrote my own BFGS solver based on that and atlas - my app is in D and it avoids interfacing to C++...

PPS: H is always symmetric. Your code seems not to use that fact - may Eigens rankUpdate(*) functions for Symmetric/selfadjoint views would improve the performance.

from cppnumericalsolvers.

PatWie avatar PatWie commented on June 7, 2024

My intension was to use a better approximation (2nd order instead of 1st order) for the line search. Now, I found

"THE PERFORMANCE OF A MODIFIED ARMIJO LINE SEARCH RULE IN BFGS OPTIMIZATION METHOD"
(http://www.ccms.or.kr/data/pdfpaper/jcms21_1/21_1_117.pdf)

See "Algorithm 2. Modifed Armijo line search rule". They used a similiar idea.

Can you upload your line search method (in Dlang) to Github-Gist, so I can look into it?
PS: Feel free to make a pull request or a fork.

from cppnumericalsolvers.

tesch1 avatar tesch1 commented on June 7, 2024

I have a frankensteinish fork of this repo with the line search from lbfgs.c patched in (https://github.com/chokkan/liblbfgs) .. it works but it's .. let's say, not the prettiest thing in the world. I'm not really sure if the same conditions are appropriate for L-BFGS-B linesearch, it seems to stall out sometimes unable to find a suitable interval. Also appears not to work for newton, haven't looked at why.
In addition, there are wrappers for Ipopt and Hager's ASA_CG (http://users.clas.ufl.edu/hager/papers/Software/)
Also- the solvers are templated - one of the aims was auto differentiation using dual/hyperduals, or Eigen's autodiff, as of yet unfinished.

from cppnumericalsolvers.

tosttost avatar tosttost commented on June 7, 2024

"My intension was to use a better approximation (2nd order instead of 1st order) for the line search."
As far as I see neither armijo rule nor your fix does not use any approximation of the function in the search direction to find the stepwidth. You only modified the acceptance criterion based on H. Note that H in BFGS is not the hessian but converges against the inverse of the hessian. The acceptance criterion uses the slope of f in the search direction (and with the fix, H) to ensure sufficient reduction of the function value. This alone is not sufficient for BFGS.
For BFGS the (strong) Wolfe conditions work well:

  1. f(x + t_d) <= f(x) + t_constparam_f'(x) // satisfied with the armijo rule
  2. |f'(x + t_d)| <= -sigma_f'(x) // not satisfied with the armijo rule
    with
    sigma < 1 and f'() = d.transpose_grad_f()

I do not know about lbfgs.c linesearch but a the wolfe conditions are mention in the docs, so it probably works.

"THE PERFORMANCE OF A MODIFIED ARMIJO LINE SEARCH RULE IN BFGS OPTIMIZATION METHOD"
I am not sure that is a good paper. I take a quick look at reference [9] where the new linesearch is presented. BFGS is not mentioned at all and the BFGS would have to work with B=H.inverse() in order to have an approximation of the hessian. This requires to solve a set of linear linear equations (or to invert a matrix) in order to calculate the search direction (and use the BFGS update formula for B).

My implementation of Flechter's linesearch needs a lot of refactoring and it still evaluates the function more often than reported by Fletcher. The algorithm is much longer than the Armijo rule. At the moment it exists only on my disk and needs some work before it can be published on github. I have a lot of other work, so it could take some time.

May the GSL implementation (written in C) helps - it is based on the same description:

Pull request / fork: not a real option, because for my purpose it is best to have no C++ code (in special, no C++ templates) at all. Additionally my knowledge of C++ is quite limited - sorry!

from cppnumericalsolvers.

PatWie avatar PatWie commented on June 7, 2024

It seems to be the best way to implement the line search algorithm by Jorge J. More' and David J. Thuente
"Line Search Algorithms with Guaranteed Sufficient Decrease"

@tesch1:
I saw your code. My problem with a templated solvers is a Matlab-binding. It is inappropriate to recompile the source code for each problem statement in Matlab. Anyway, please drop me an email (base64: cGF0cmlja0B3aWVzY2hvbGxlay5pbmZv), because I am working on a nice, slightly different idea for this project. Maybe, a templated version works for me, too.

In general it would be great to continue this discussion here:
https://groups.google.com/forum/?hl=de#!forum/cppnumericalsolver

from cppnumericalsolvers.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.