Giter Club home page Giter Club logo

Comments (5)

whereistejas avatar whereistejas commented on May 30, 2024

Hi @stefan-k,
I'm interested in this particular issue and would like to help fix it (if it is still relevant). Could you give me some pointers on where to get started? I'm also interested in contributing in general, so please feel to direct to me more urgent or important issues. :)

from argmin.

stefan-k avatar stefan-k commented on May 30, 2024

Hi @whereistejas ,

thanks for your interest to contribute! It's been a while since I opened this issue, so I can't remember the details anymore. The implementation follows this paper [1]. The code can be found in [2]. I think a good way to move forward is to reproduce the errors by running the steepestdescent and nonlinear_cg examples with the HagerZhangLineSearch. I suspect a wrong <, >, <= or >=.

While going through the implementation, I noticed a comment that not all stopping criteria are implemented, but I don't think that this is causing the problem.

If you have any questions feel free to get in touch via the issues or via gitter. I'm happy to help, but I may unfortunately be a bit slow to respond currently!

[1] William W. Hager and Hongchao Zhang. "A new conjugate gradient method with guaranteed descent and an efficient line search." SIAM J. Optim. 16(1), 2006, 170-192.
[2] https://github.com/argmin-rs/argmin/blob/master/src/solver/linesearch/hagerzhang.rs

from argmin.

whereistejas avatar whereistejas commented on May 30, 2024

Hi @stefan-k ,
I compared the output of the two line search methods in steepestdescent.

The output for Hager-Zhang method is:

[
  1.1999999999997424,
  1.2000000000001065
]

Where as the output for More-Thuente method is:

[
  1.0724442427550391,
  1.151531565672518
]

So, I'm guessing, this is the discrepancy you are talking about?

I also cross checked the algorithm:

fn update<O: ArgminOp<Param = P, Output = F>>(
&mut self,
op: &mut OpWrapper<O>,
(a_x, a_f, a_g): Triplet<F>,
(b_x, b_f, b_g): Triplet<F>,
(c_x, c_f, c_g): Triplet<F>,
) -> Result<(Triplet<F>, Triplet<F>), Error> {
// U0
if c_x <= a_x || c_x >= b_x {
// nothing changes.
return Ok(((a_x, a_f, a_g), (b_x, b_f, b_g)));
}
// U1
if c_g >= F::from_f64(0.0).unwrap() {
return Ok(((a_x, a_f, a_g), (c_x, c_f, c_g)));
}
// U2
if c_g < F::from_f64(0.0).unwrap() && c_f <= self.finit + self.epsilon_k {
return Ok(((c_x, c_f, c_g), (b_x, b_f, b_g)));
}
// U3
if c_g < F::from_f64(0.0).unwrap() && c_f > self.finit + self.epsilon_k {
let mut ah_x = a_x;
let mut ah_f = a_f;
let mut ah_g = a_g;
let mut bh_x = c_x;
loop {
let d_x = (F::from_f64(1.0).unwrap() - self.theta) * ah_x + self.theta * bh_x;
let d_f = self.calc(op, d_x)?;
let d_g = self.calc_grad(op, d_x)?;
if d_g >= F::from_f64(0.0).unwrap() {
return Ok(((ah_x, ah_f, ah_g), (d_x, d_f, d_g)));
}
if d_g < F::from_f64(0.0).unwrap() && d_f <= self.finit + self.epsilon_k {
ah_x = d_x;
ah_f = d_f;
ah_g = d_g;
}
if d_g < F::from_f64(0.0).unwrap() && d_f > self.finit + self.epsilon_k {
bh_x = d_x;
}
}
}
// return Ok(((a_x, a_f, a_g), (b_x, b_f, b_g)));
Err(ArgminError::InvalidParameter {
text: "HagerZhangLineSearch: Reached unreachable point in `update` method.".to_string(),
}
.into())
}

With the algorithm in the paper, I couldn't find any mistakes.

from argmin.

stefan-k avatar stefan-k commented on May 30, 2024

Thanks for comparing the code with the paper! Yes, this is the discrepancy I was talking about. As you can see, it barely moves away from the initial position.
I have to admit, I'm a bit lost as to how to proceed. When looking at the implementation in Julia you can see from the comments at the top of the file that the original code (by the author of the paper) "has undergone numerous revisions since publication of the paper". Therefore, this may be a weird edge case which was fixed after publication of the paper (just a guess).
Do you want to have a look at this? It might be pretty tedious to go through the original code and figure out what was changed and what might be going wrong in our implementation. This problem definitely needs fixing, but I can totally understand if you want to tackle a problem which is more interesting and satisfying as your first contribution. If you want to set this issue aside for now and work on something else then I suggest you get in touch in the Gitter channel and we can find something interesting for you.

from argmin.

whereistejas avatar whereistejas commented on May 30, 2024

Hi @stefan-k,
I'm fine with working on this issue.

On further comparison, I did find some differences:

// S2
if (c_x - bb_x).abs() < F::epsilon() {
c_bar_x = self.secant(b_x, b_g, bb_x, bb_g);
}
// S3
if (c_x - aa_x).abs() < F::epsilon() {
c_bar_x = self.secant(a_x, a_g, aa_x, aa_g);
}
// S4
if (c_x - aa_x).abs() < F::epsilon() || (c_x - bb_x).abs() < F::epsilon() {

The if conditions here are different from what is mentioned in the paper:
double-secant

This is my current lead.

from argmin.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.