Comments (8)
- Did you train it from scratch or fine-tune it? It's better to fine-tune
- Did you try a smaller `force_decay`? `force_decay` should vary with your network architecture.
from caffe.
I tried a smaller force_decay and trained it from scratch. It worked, but I still have some questions.
- I trained ResNet-18 with force_decay on ImageNet from scratch, then used nn_decomposer.py for the low-rank decomposition with a rank ratio of 0.95. The original top-5 accuracy is 0.89, but after decomposition it drops to 0.34 without fine-tuning. Meanwhile, I tested the speed on a Titan X with CUDA 8.0 and cuDNN 5.1: the baseline takes 6.18 ms, and after the low-rank decomposition it takes 6.24 ms (7.5 ms if trained without force_decay). It seems hard to reach the 2x GPU speedup your paper reports. Is that right?
- I want to know whether decomposing layer by layer gives a better final result than decomposing the whole network at once. For example, I decompose the first layer and fine-tune it; after fine-tuning, I decompose the next layer, and so on.
Your work is very good and your advice will help me a lot. Thanks!
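For context, the two-factor decomposition discussed in this thread can be sketched with a plain SVD. The energy-based rank selection below is one common reading of "rank ratio" and is not necessarily what nn_decomposer.py implements; the function name and signature are illustrative only.

```python
import numpy as np

def decompose_conv(weights, rank_ratio=0.95):
    """Split a conv weight tensor (N, C, H, W) into two low-rank factors.

    The rank keeps `rank_ratio` of the singular-value energy -- one
    plausible reading of "rank ratio", not necessarily the repo's.
    """
    n, c, h, w = weights.shape
    mat = weights.reshape(n, c * h * w)            # flatten filters into a matrix
    u, s, vt = np.linalg.svd(mat, full_matrices=False)
    energy = np.cumsum(s) / np.sum(s)
    rank = int(np.searchsorted(energy, rank_ratio)) + 1
    # Factor 1: a rank x C x H x W conv; Factor 2: an N x rank 1x1 conv
    w1 = (np.diag(s[:rank]) @ vt[:rank]).reshape(rank, c, h, w)
    w2 = u[:, :rank].reshape(n, rank, 1, 1)
    return w1, w2, rank
```

Note that the original N-filter conv becomes a rank-filter conv followed by an N-filter 1x1 conv, so the speedup hinges on how far below N the chosen rank falls; if the rank stays close to N, the extra layer can even add overhead, consistent with the 6.18 ms vs 6.24 ms timing above.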
- Fine-tuning is required to recover accuracy after decomposition. Please do layer-wise timing to verify the bottleneck; the ResNet architecture is very different from AlexNet's.
- Not sure how much better it will be, but the fine-tuning time will increase significantly.
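One way to do the layer-wise timing suggested above is Caffe's built-in benchmark, which prints per-layer forward and backward times; the file names below are placeholders for your own prototxt and caffemodel.

```shell
# Per-layer timing with Caffe's benchmark mode (GPU 0, averaged over 50 runs).
caffe time --model=deploy.prototxt \
           --weights=decomposed.caffemodel \
           --gpu=0 --iterations=50
```

Comparing the per-layer numbers of the baseline and the decomposed model shows which layers actually got faster and which merely gained 1x1-conv overhead.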
Thanks for your answer! I see that your paper shows results for ResNet-20 and GoogLeNet in Figure 5. Are those models trained on ImageNet? What is their speedup on GPU? And could you share the caffemodels for a quick test? Thanks!
ResNet-20 is trained on CIFAR-10 and GoogLeNet on ImageNet. I recommend first testing how much low-rank approximation accelerates them without force regularization. If that looks promising, you can then use force regularization for higher speed.
Hi, I have trained ResNet-18 with a higher force-regularization coefficient, and it reaches top-5 ~0.87 (the original is ~0.89). Applying the low-rank decomposition to this model with a 0.95 rank ratio shrinks the caffemodel from 48 MB to 3.2 MB (with standard training it only shrinks from 48 MB to 36 MB). Checking num_output in the decomposed prototxt, I found that some layers are reduced to a rank of 1. But when I fine-tune the low-rank model back toward the original accuracy, top-5 and top-1 start near 0, and even after several epochs the top-5 only reaches ~30. So I want to know: does the method break down when the rank drops to a very small number, even though I keep the rank ratio at 0.95?
@weitaoatvison this is one of the open issues in this work that is still pending, as I mentioned here. The current strategy is to use a smaller rank ratio. Let me know if you make progress on this issue. Thanks.
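Besides lowering the rank ratio globally, one workaround for the rank-1 layers described above is to put a floor under the per-layer rank. This is a hypothetical safeguard, not a feature of nn_decomposer.py: force regularization can concentrate nearly all singular-value energy in one component, so a pure energy criterion returns rank 1 and the layer becomes too weak to recover through fine-tuning.

```python
import numpy as np

def select_rank(singular_values, rank_ratio=0.95, min_rank=4):
    """Pick a rank from the singular-value energy, but never below min_rank.

    `min_rank` is an illustrative safeguard against degenerate rank-1
    layers; it is not a parameter of the repo's decomposer.
    """
    s = np.asarray(singular_values, dtype=float)
    energy = np.cumsum(s) / np.sum(s)
    rank = int(np.searchsorted(energy, rank_ratio)) + 1
    # Clamp into [min_rank, len(s)] (a layer cannot exceed its full rank).
    return max(min(rank, len(s)), min(min_rank, len(s)))
```

With a spectrum like `[100.0, 0.1, 0.1, 0.1, 0.1]` the energy criterion alone would pick rank 1, while the clamped version keeps 4 components, trading some compression for a model that fine-tuning can still repair.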
OK. I will do more work on it. Thanks!