
ada-hessian's People

Contributors

davda54

ada-hessian's Issues

AttributeError: 'Parameter' object has no attribute 'hess'

Hi David,

Thanks for this repo. I tried it as an optimizer on toy examples and it worked fine, but I am struggling to make it work for an object detection case. Specifically, I am trying to use it as the optimizer for YOLOv5 from Ultralytics.

After updating the train.py file there, I run into an error during training. The training script uses a learning-rate scheduler, and I was wondering whether that is the source of the problem.
Here is part of the error log:

  File "train.py", line 309, in train
    optimizer.step()
  File "/usr/local/lib/python3.6/dist-packages/torch/optim/lr_scheduler.py", line 67, in wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/autograd/grad_mode.py", line 26, in decorate_context
    return func(*args, **kwargs)
  File "/yolov5/ada_hessian.py", line 100, in step
    self.zero_hessian()
  File "/yolov5/ada_hessian.py", line 59, in zero_hessian
    if not isinstance(p.hess, float) and self.state[p]["hessian step"] % self.update_each == 0:
AttributeError: 'Parameter' object has no attribute 'hess'

Do you have any suggestions?

Regards,
Sam
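
The traceback shows zero_hessian reading p.hess, so at least one parameter reached the optimizer without that attribute. One possible cause (an assumption, not confirmed by the log) is that some parameters were registered after the optimizer was constructed, for example through add_param_group. A minimal diagnostic sketch follows; the helper name params_missing_hess is hypothetical, and plain stand-in objects are used in place of real torch Parameters:

```python
from types import SimpleNamespace


def params_missing_hess(param_groups):
    """Return (group_index, param_index) pairs for parameters lacking `hess`.

    `param_groups` is expected to have the shape of `optimizer.param_groups`:
    a list of dicts, each holding a "params" list.
    """
    missing = []
    for gi, group in enumerate(param_groups):
        for pi, p in enumerate(group["params"]):
            if not hasattr(p, "hess"):
                missing.append((gi, pi))
    return missing


# Stand-in objects instead of real torch Parameters (illustration only).
tagged = SimpleNamespace(hess=0.0)  # was given a `hess` attribute
untagged = SimpleNamespace()        # never given a `hess` attribute
groups = [{"params": [tagged]}, {"params": [tagged, untagged]}]
print(params_missing_hess(groups))  # [(1, 1)]
```

Calling the same helper on the real optimizer.param_groups just before optimizer.step() would show which groups, if any, contain such parameters.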

question: about create_graph

Hello,

First of all, I appreciate this repo. I just want to ask why create_graph is set to True in this case.

Thanks,
Nihat
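
For context, a short derivation of why a second differentiation pass is needed. It assumes the optimizer estimates the Hessian diagonal with Hutchinson's method, as in the AdaHessian paper. The symbols are notation chosen here, not taken from the repository:

```latex
% g: gradient of the loss L w.r.t. the parameters \theta
% H: Hessian of L, z: random vector with entries \pm 1
g(\theta) = \nabla_\theta L(\theta), \qquad
H z = \nabla_\theta \bigl( g(\theta)^\top z \bigr), \qquad
\operatorname{diag}(H) \approx \mathbb{E}_z \bigl[ z \odot (H z) \bigr]
```

Computing Hz means differentiating g itself, so the backward pass that produces g has to record its own graph. In PyTorch, that is what create_graph=True requests.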

Accumulation of gradient and Hessian

Hi,

I tried the "Advanced usage" example code you showed in your README to accumulate multiple Hessians. I did this by calling set_hessian() multiple times for the same input to get a better estimate for the Hessian (see paper formula 11). However, the estimate got worse when increasing the number of samples.

It seems that the gradient is accumulated, but each Hessian estimate is computed from the gradients accumulated so far rather than from the most recently added gradient alone. Say we have gradients g1, g2, g3. The Hessian-vector product estimates should be computed from g1 to get H1, from g2 to get H2, and from g3 to get H3; instead, they are computed from g1, (g1+g2), and (g1+g2+g3).

So, coming back to my original example, where g1=g2=g3 (because the input is the same), the Hessian is too large by a factor of (1+2+3); if I divide the Hessian by this value, the result is as expected.
A good example for seeing this is something like f(x,y)=x^2+y^2 with betas=(0, 0) and lr=1, where the optimizer should jump directly to the minimum. Without the correction mentioned above, it needs multiple steps.
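
The effect described above can be reproduced in a toy model without autograd. The sketch below is an illustration under stated assumptions, not the repository's code: a central finite difference stands in for the Hessian-vector product, and the accumulated gradient after k identical backward passes is modelled as k times the gradient. All function names are hypothetical.

```python
import random


def grad(p):
    """Gradient of f(x, y) = x^2 + y^2, whose Hessian is diag(2, 2)."""
    return [2.0 * v for v in p]


def hutchinson_diag(grad_fn, p, z, eps=1e-3):
    """Estimate diag(H) as z * (H z); H z is a central difference of grad_fn."""
    plus = grad_fn([pi + eps * zi for pi, zi in zip(p, z)])
    minus = grad_fn([pi - eps * zi for pi, zi in zip(p, z)])
    hz = [(a - b) / (2.0 * eps) for a, b in zip(plus, minus)]
    return [zi * hi for zi, hi in zip(z, hz)]


def scaled_grad(k):
    """Model of the gradient buffer after k identical backward passes were summed."""
    return lambda q: [k * g for g in grad(q)]


p = [3.0, -1.5]
rng = random.Random(0)
n = 3
per_sample = [0.0, 0.0]  # each estimate uses only the newest gradient
cumulative = [0.0, 0.0]  # each estimate uses the accumulated gradient
for k in range(1, n + 1):
    z = [rng.choice((-1.0, 1.0)) for _ in p]  # Rademacher vector
    per_sample = [s + h for s, h in zip(per_sample, hutchinson_diag(grad, p, z))]
    cumulative = [s + h for s, h in zip(cumulative, hutchinson_diag(scaled_grad(k), p, z))]

print(per_sample)  # close to [6.0, 6.0], i.e. 3 * diag(H)
print(cumulative)  # close to [12.0, 12.0], i.e. (1 + 2 + 3) * diag(H)
```

In this model the cumulative sum equals (1+2+3) times a single Hessian estimate, which matches the factor reported above.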

Bug about generator

Hi, I get an error when trying to use the AdaHessian optimizer:
"AttributeError: 'torch._C.Generator' object has no attribute 'device'"
PyTorch version: 1.1.0
