davda54 / ada-hessian
Easy-to-use AdaHessian optimizer (PyTorch)
License: MIT License
Hi, I get an error when trying to use the AdaHessian optimizer:
"AttributeError: 'torch._C.Generator' object has no attribute 'device'"
PyTorch version: 1.1.0
Hi,
I tried the "Advanced usage" example code from your README to accumulate multiple Hessians. I called set_hessian() multiple times for the same input to get a better estimate of the Hessian (see formula 11 in the paper). However, the estimate got worse as I increased the number of samples.
It seems that the gradient is accumulated, and each Hessian estimate is computed from the gradients accumulated so far instead of from just the most recently added gradient. Say we have gradients g1, g2, g3. The Hessian-vector product estimates should be computed from g1 to get H1, from g2 to get H2, and from g3 to get H3, but instead they are computed from g1, (g1+g2), and (g1+g2+g3).
So, coming back to my original example, where g1=g2=g3 (because the input is the same), the Hessian is too large by a factor of (1+2+3), and if I divide the Hessian by this value, the result is as expected.
A good example for seeing this is f(x,y)=x^2+y^2 with betas=(0, 0) and lr=1, where the optimizer should jump directly to the minimum; without the correction mentioned above, it needs multiple steps.
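The arithmetic behind this report can be sketched in plain Python, without PyTorch. This is only an illustration under stated assumptions, not the optimizer's actual code: the helper names (hutchinson_sum, step) are hypothetical, the Hessian of f(x,y)=x^2+y^2 is hard-coded as 2*I, and the update is assumed to reduce to lr * g / |H| with betas=(0, 0), using the single gradient g.

```python
import random

HESSIAN_DIAG = [2.0, 2.0]  # exact Hessian diagonal of f(x, y) = x^2 + y^2


def hutchinson_sum(n_calls, accumulate_grads, rng):
    """Sum n_calls Hutchinson estimates z * (H z) of the Hessian diagonal."""
    total = [0.0, 0.0]
    for call in range(1, n_calls + 1):
        # With accumulated gradients and identical inputs, the k-th call
        # differentiates k * g instead of g, so its estimate is scaled by k.
        scale = call if accumulate_grads else 1
        z = [rng.choice((-1.0, 1.0)) for _ in HESSIAN_DIAG]  # Rademacher probe
        hz = [scale * h * zi for h, zi in zip(HESSIAN_DIAG, z)]
        total = [t + zi * hzi for t, zi, hzi in zip(total, z, hz)]
    return total


def step(point, hess, lr=1.0):
    """Assumed simplified update for betas=(0, 0): p - lr * g / |H|."""
    grad = [2.0 * p for p in point]
    return [p - lr * g / abs(h) for p, g, h in zip(point, grad, hess)]


rng = random.Random(0)
start = [3.0, -4.0]

accumulated = hutchinson_sum(3, accumulate_grads=True, rng=rng)
corrected = [h / (1 + 2 + 3) for h in accumulated]

after_uncorrected = step(start, accumulated)  # does not reach the minimum
after_corrected = step(start, corrected)      # lands on the minimum
```

Under these assumptions the accumulated estimate is (1+2+3) times the true diagonal, and only the corrected value lets a single step with lr=1 reach the minimum at the origin.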
Hi David,
Thanks for this repo. I tried it as an optimizer on toy examples and it worked fine, but I am struggling to make it work for an object detection case. Specifically, I am trying to use it as the optimizer for YOLOv5 from Ultralytics.
After I updated the train.py file there, I ran into an error during training. The training script uses a scheduler, and I was wondering whether this is the source of the problem.
Here is part of the error log:
File "train.py", line 309, in train
optimizer.step()
File "/usr/local/lib/python3.6/dist-packages/torch/optim/lr_scheduler.py", line 67, in wrapper
return wrapped(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/autograd/grad_mode.py", line 26, in decorate_context
return func(*args, **kwargs)
File "/yolov5/ada_hessian.py", line 100, in step
self.zero_hessian()
File "/yolov5/ada_hessian.py", line 59, in zero_hessian
if not isinstance(p.hess, float) and self.state[p]["hessian step"] % self.update_each == 0:
AttributeError: 'Parameter' object has no attribute 'hess'
Do you have any suggestions?
Regards,
Sam
Hello,
First of all, I appreciate this repo. I just want to ask why the create_graph attribute is set to True in this case.
Thanks,
Nihat
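For context on the create_graph question, here is a minimal sketch of how a Hessian-vector product depends on it. It assumes a recent PyTorch version and uses a toy loss and a fixed probe vector z chosen for illustration; it is not taken from the repository's code. The idea is that create_graph=True records the backward pass itself, so the resulting gradient can be differentiated a second time.

```python
import torch

z = torch.tensor([1.0, -1.0])  # fixed Rademacher-style probe (illustrative)

# Case 1: create_graph=True keeps the gradient differentiable.
x = torch.tensor([3.0, -4.0], requires_grad=True)
loss = (x ** 2).sum()  # f(x, y) = x^2 + y^2, Hessian = 2 * I
loss.backward(create_graph=True)
grad_is_differentiable = x.grad.requires_grad

# Hessian-vector product H z, obtained by differentiating the gradient.
(hz,) = torch.autograd.grad(x.grad, x, grad_outputs=z)
hess_diag = z * hz  # Hutchinson estimate of the Hessian diagonal

# Case 2: a plain backward() leaves a gradient with no graph attached,
# so a second differentiation is not possible.
y = torch.tensor([3.0, -4.0], requires_grad=True)
(y ** 2).sum().backward()
try:
    torch.autograd.grad(y.grad, y, grad_outputs=z)
    second_order_ok = True
except RuntimeError:
    second_order_ok = False
```

In this toy case the estimate equals the exact diagonal because the Hessian is diagonal. Recent PyTorch versions may emit a warning about reference cycles when backward() is called with create_graph=True.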
Hi,
thank you for the implementation. Unfortunately, I am having a problem with a variable that is not defined, and I am not sure what it is supposed to be:
File "adahessian.py", line 77, in set_hessian
h_zs = torch.autograd.grad(grads, params, grad_outputs=z, only_inputs=True, retain_graph=False)
UnboundLocalError: local variable 'z' referenced before assignment
Hi,
thanks for your re-implementation of the AdaHessian optimizer.
class AdaHessian(torch.optim.Optimizer)