Comments (4)
"(b) After adding the result of the third convolution to the result from the other branch, the final results will be passed to ReLU6 and then quantized after ReLU6"
I think there is no ReLU6 appended after the element-wise addition in MobileNetV2, so quantization might be lost there, as @dreambear1234 said.
Hi,
Thanks for your interest in our work and your detailed questions. Note that we are not currently producing a fully quantized model, since a fully quantized model has to match hardware details that vary significantly across CPUs, GPUs, and FPGAs. As a general quantization method, we believe model size dominates the memory bottleneck and multiplications dominate the computation, so we compress all weights in the network and perform all multiplications in low precision. Please see the detailed answers below:
- Typically the models we studied do not use bias. Leaving the bias unquantized does not introduce floating-point multiplications, since the convolutions are followed by ReLUs where the bias is merged into the activations.
- (a) The input images range from 0 to 255, so they are already quantized and there is no need to quantize them again. (b) After the result of the third convolution is added to the result from the other branch, the sum is passed through ReLU6 and quantized after ReLU6 (a minimal sketch of this activation quantization follows this list). Therefore it only incurs a floating-point addition. (c) The same reason as (b).
- Here we did not fuse the batch norm into the convolutional layer: batch normalization is essentially a linear transformation, so it can be folded into the scaling factor of the quantization operation without hurting accuracy (see the second sketch after this list).
- We also observed the same phenomenon, which we believe is normal.
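To make (b) concrete, here is a minimal sketch of uniform activation quantization applied right after ReLU6. It is illustrative only: the function name and the fixed [0, 6] range are assumptions for this sketch, and the actual ZeroQ code derives activation ranges from data rather than hard-coding them.

```python
import torch

def quantize_after_relu6(x, num_bits=8):
    # ReLU6 clamps activations to [0, 6], so a fixed range can be used here.
    x = torch.clamp(x, 0.0, 6.0)
    qmax = 2 ** num_bits - 1
    scale = 6.0 / qmax                  # step size of the uniform grid
    q = torch.round(x / scale)          # integer codes in {0, ..., qmax}
    return q * scale                    # dequantized ("fake-quantized") activations
```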
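And a minimal sketch of the third answer, assuming per-channel uniform weight quantization: batch normalization is an affine transform, so its per-channel multiplier can be absorbed into the dequantization scale while the additive part remains a floating-point bias. The names below are illustrative and not taken from the ZeroQ code.

```python
import torch

def absorb_bn_into_scale(weight_scale, bn_gamma, bn_running_var, eps=1e-5):
    # BatchNorm multiplies each output channel by gamma / sqrt(var + eps);
    # that per-channel factor can be merged into the dequantization scale,
    # leaving the integer weights themselves untouched.
    bn_factor = bn_gamma / torch.sqrt(bn_running_var + eps)
    return weight_scale * bn_factor
```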
Any more comments?
Thanks for the nice paper. I assume this is not the complete code for the paper? All of the concerns @dreambear1234 raised are still valid.
@yaohuicai, how are you performing multiplications in low precision? Based on this code ( https://github.com/amirgholami/ZeroQ/blob/master/classification/utils/quantize_model.py#L45 ), only the activations after ReLU are quantized, and quantization is lost on the Conv and FC layers. I think your answer to Q3 is simply not correct: we have tried it, and fusing BN affects the accuracy of quantized models considerably.
Please let us know whether updated code is available; otherwise, I believe this code will not give a fair comparison against other methods. We are planning to move only the distillation code to the distiller framework, add proper quantization of activations, do batch-norm folding (a sketch of standard conv+BN folding follows below), etc.
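For reference, a minimal sketch of the standard conv+BN folding mentioned above (textbook formulas, not code from this repository): the BN statistics are folded into the convolution weights and bias before the weights are quantized.

```python
import torch

def fold_bn_into_conv(conv_weight, conv_bias, bn_gamma, bn_beta,
                      bn_running_mean, bn_running_var, eps=1e-5):
    # Standard folding: W' = (gamma / sqrt(var + eps)) * W
    #                   b' = beta + (gamma / sqrt(var + eps)) * (b - mean)
    factor = bn_gamma / torch.sqrt(bn_running_var + eps)
    folded_weight = conv_weight * factor.reshape(-1, 1, 1, 1)
    if conv_bias is None:
        conv_bias = torch.zeros_like(bn_running_mean)
    folded_bias = bn_beta + factor * (conv_bias - bn_running_mean)
    return folded_weight, folded_bias
```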