Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning".
Home Page: https://arxiv.org/abs/2208.11580
Hello, thanks for open-sourcing the great work!
I'm following the example in the main README for mixed quantization + 2:4 pruning. Everything runs fine up to python spdy.py rn18 imagenet 8 mixed --dp, but the DP step itself throws an error:
Traceback (most recent call last):
File "spdy.py", line 240, in <module>
print(get_score(np.ones(len(layers))))
File "spdy.py", line 204, in get_score
solution = dp(costs)
File "spdy.py", line 182, in dp
solution.append(PD[layer][timing])
IndexError: index -10219 is out of bounds for axis 0 with size 10001
The only changes I made were changing the path to imagenet and commenting out lines 233-248 in database.py. Is this a version incompatibility issue? Any help is appreciated!
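In case it helps narrow things down: the failure looks like a negative timing index into the DP table, which NumPy silently wraps around. A guard like the following (names copied from the traceback, otherwise hypothetical) at least avoids the crash, though I doubt it is the right fix:

```python
import numpy as np

def lookup(PD, layer, timing):
    # A negative `timing` wraps around under NumPy indexing; clamping it
    # makes an out-of-range bucket explicit instead of crashing.
    timing = int(np.clip(timing, 0, PD.shape[1] - 1))
    return PD[layer][timing]
```

Should the cost ever be able to exceed the bucket range like this, or does it indicate my profiling database is off?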
Thanks for your wonderful work and for making the code available!
However, when I use the command python main_trueobs.py rn18 imagenet quant --wbits 4 --abits 4 --save rn18_4w4a.pth
to quantize ResNet18 to 4w4a on ImageNet, I get a quantized model with 64.07% accuracy. This is much lower than the 69.56% accuracy reported in Table 4 of the paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning".
Then I tried "Batchnorm tuning" and "Statistics correction" to recover the accuracy, as the authors suggest.
With the command python postproc.py rn18 imagenet rn18_4w4a.pth --bnt, the model reaches 65.92% accuracy.
With the command python postproc.py rn18 imagenet rn18_4w4a.pth --statcorr --statcorr-samples 1024, the model reaches 65.76% accuracy.
However, both are still much lower than the 69.56% accuracy reported in the paper.
I also tested the 8w8a quantized model, and its performance is close to the original model's, as shown in the following figure.
Note that I used PyTorch v1.9.0 for these experiments.
I would like to know whether I ran the wrong commands, or whether I missed some important detail that explains why my quantized model is so much worse than the one reported in the paper.
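To make sure I understand what I am running: my mental model of "Statistics correction" (purely my assumption; the function name here is hypothetical, not the repo's API) is that it rescales activations so their batch statistics match those of the full-precision model:

```python
import numpy as np

def stat_correct(y_quant, mean_fp, std_fp):
    # Shift and rescale quantized-model activations so their batch
    # statistics match the full-precision model's (hypothetical sketch).
    mean_q = y_quant.mean(axis=0)
    std_q = y_quant.std(axis=0) + 1e-8
    return (y_quant - mean_q) / std_q * std_fp + mean_fp
```

If that is roughly what --statcorr does, is 1024 samples enough for stable statistics at 4w4a?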
Hello, I am interested in your work; it is a really meaningful project.
I applied it to YOLOv5s, with a full-precision model trained on the NWPU-VHR-10 remote sensing dataset (mAP@0.5: 89.4%). With 4W4A the results were unsatisfactory: mAP dropped to just over 20%. The model seems very sensitive to activation quantization, since 4W8A works great (87.9%), while 8W4A again performs poorly. My 2:4 sparsity results are also not ideal.
I don't know why this is happening. Looking forward to your reply.
Hello,
This is amazing work, and I am trying to get a clear understanding of OBC by running the code on my machine.
During debugging, when I printed the Hinv of each layer, I noticed that some Hinvs are all NaN values. I guess this is because the corresponding H is not invertible or has a zero or NaN determinant.
I wonder if you could suggest any potential solutions.
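One workaround I am considering (an assumption on my part, not something from the paper) is to add a small damping term to the diagonal before inversion, so that a near-singular Hessian stays invertible:

```python
import numpy as np

def stable_inverse(H, percdamp=0.01):
    # Add a small multiple of the average diagonal entry to H's diagonal,
    # so a singular or ill-conditioned Hessian remains invertible.
    damp = percdamp * np.mean(np.diag(H))
    return np.linalg.inv(H + damp * np.eye(H.shape[0]))
```

Would something like this be reasonable, or does it distort the reconstruction too much?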
Thank you in advance.
Thank you for this work!
The default value of the variable self.norm is 2.4 (used here). Why is that the case? More generally, what is the purpose of the grid search over the values described by maxshrink and grid? I could not find anything related to it in the paper.
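To check my understanding, I sketched what I think the search does (my assumption; the function name and symmetric-quantization details are hypothetical): it tries maxshrink * grid progressively shrunken clipping ranges and keeps the one minimizing the quantization error raised to the power self.norm:

```python
import numpy as np

def find_scale(w, bits=4, grid=100, maxshrink=0.8, norm=2.4):
    # Try progressively shrunken clipping ranges and keep the scale that
    # minimizes the round-to-nearest error measured in the `norm` power.
    maxq = 2 ** (bits - 1) - 1
    wmax = np.abs(w).max()
    best_err, best_scale = np.inf, wmax / maxq
    for i in range(int(maxshrink * grid)):
        scale = (1 - i / grid) * wmax / maxq
        q = np.clip(np.round(w / scale), -maxq, maxq) * scale
        err = np.sum(np.abs(q - w) ** norm)
        if err < best_err:
            best_err, best_scale = err, scale
    return best_scale
```

If that is roughly right, is 2.4 simply an empirically good exponent between the L2 and L3 norms?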
I noticed that in the TrueOBS.quantize method, Hinv1 is calculated by the self.prepare() method like so:
W, H, Hinv1, Losses = self.prepare()
Then Hinv1 is repeated, and slices of it are returned in this line by the self.prepare_iter() method:
i2, count, w, Hinv, mask, rangecount, idxcount = self.prepare_iter(i1, parallel, W, Hinv1)
Finally, Hinv is passed to the self.prepare_sparse method, which recalculates the inverse Hessian for each row:
Finally, Hinv
is passed to self.prepare_sparse
method which recalculates the inverse Hessian for each row:
def prepare_sparse(self, w, mask, Hinv, H):
    ...
    for i in range(w.shape[0]):
        ...
        Hinv[i] = self.invert(H1)
    return start
It seems like we could just pass the Hessian H to the self.prepare_sparse method and have it calculate the inverse Hessian only once.
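A minimal sketch of what I mean (hypothetical names; this assumes every row shares the same unmasked H, which may be exactly the assumption that does not hold):

```python
import numpy as np

def prepare_sparse_shared(W, H):
    # Invert the shared Hessian once instead of once per row; only valid
    # while every row of W uses the same (unmasked) H.
    Hinv_shared = np.linalg.inv(H)
    return np.repeat(Hinv_shared[None, :, :], W.shape[0], axis=0)
```

Is the per-row inversion there because the per-row masks make each row's effective Hessian different, or is it redundant work?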
Dear author,
Thanks for the great work.
I am interested in your work and am trying to reproduce the results by following the Non-uniform compression section.
In the Compute corresponding losses step, it seems the script has to generate score results for the next step, but there are two exit() calls, at
https://github.com/IST-DASLab/OBC/blob/main/database.py#L240
https://github.com/IST-DASLab/OBC/blob/main/database.py#L248
which terminate the program early. Are these exit() calls for debugging purposes?
Thanks,
I would like to experiment with other models and datasets, such as CIFAR-10, but I noticed a significant decrease in accuracy. Why is this happening? Do I need to modify other related parameters? Thanks again!