Comments (7)
@HaiPhan1991
1. pruning (fine-tuning)
- Change the type of all the conv and ip layers to the compress layer type, as described in the README.
- Pruning conv layers: set iter_stop in the conv layers to max_iter, and in the ip layers to 0 or a negative value (a 0 or negative iter_stop means no pruning). Then start training.
- Pruning ip layers: set iter_stop in the conv layers to 0 or negative, and in the ip layers to max_iter. Then start training.
- Done.
Tips: for a proof of the pruning concept, don't set the c_rate too large; [-1, 1] would be a good range to start with, or try it out with mnist first.
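The first pass (prune conv, freeze ip) might look like the sketch below in prototxt form. The layer type and parameter-block names here (CConvolution, CInnerProduct, cconvolution_param, cinner_product_param) are assumptions; use whatever names the README of your DNS fork actually specifies:

```
# Hypothetical prototxt fragment -- check the DNS README for the real
# layer type and parameter names in your fork.
layer {
  name: "conv1"
  type: "CConvolution"            # compress version of Convolution
  bottom: "data"
  top: "conv1"
  convolution_param { num_output: 20 kernel_size: 5 }
  cconvolution_param {
    c_rate: 1.0                   # pruning aggressiveness; start in [-1, 1]
    iter_stop: 10000              # set to max_iter: prune conv throughout training
  }
}
layer {
  name: "ip1"
  type: "CInnerProduct"           # compress version of InnerProduct
  bottom: "conv1"
  top: "ip1"
  inner_product_param { num_output: 500 }
  cinner_product_param {
    c_rate: 1.0
    iter_stop: 0                  # 0 or negative: no pruning of ip in this pass
  }
}
```

For the second pass, swap the two iter_stop values so the ip layers are pruned while the conv masks stay fixed.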
2. check the pruning rate
I used python scripts to do this. The python API is not provided in this repo, but you can work around that by compiling caffe.proto manually, then extracting the weights/bias and mask blobs with your own python scripts.
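As a sketch of that workaround: once a layer's mask blobs have been dumped to numpy arrays (the parsing from the compiled caffe.proto is up to you and not shown here), the per-layer pruning rate is just the fraction of zeros in the mask, and the overall compression rate is total parameters over remaining parameters:

```python
import numpy as np

def pruning_rate(mask):
    """Fraction of weights pruned away: zeros in the 0/1 mask blob."""
    mask = np.asarray(mask)
    return 1.0 - float(np.count_nonzero(mask)) / mask.size

def compression_rate(masks):
    """Overall compression across layers: total params / kept params."""
    total = sum(np.asarray(m).size for m in masks)
    kept = sum(int(np.count_nonzero(m)) for m in masks)
    return float(total) / kept

# Toy example: 3 of 4 weights kept in this mask.
m = np.array([1, 0, 1, 1])
print(pruning_rate(m))        # 0.25
print(compression_rate([m]))
```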
I am also trying to apply DNS to a newer version of caffe, so that the python API can be used. Here is my repo. In my version, after compiling caffe and pycaffe, prepare your compressed DNS caffemodel and run the following command from your CAFFE_ROOT (make sure you have set the CAFFE_ROOT environment variable to the dir of your caffe folder):
python compression_scripts/dns_to_normal.py <dns.prototxt> <dns_model.caffemodel> <target.prototxt> <output_target.caffemodel>
The compression rate should be shown on the screen, and the output_target.caffemodel should be the same size as a normal caffemodel (about 1/2 the size of the dns_model.caffemodel), which can then be tested.
e.g.
python compression_scripts/dns_to_normal.py examples/mnist/dns_train_val.prototxt examples/mnist/dns_iter_10000.caffemodel examples/mnist/mnist_train_val.prototxt examples/mnist/mnist_test.caffemodel
My repo is still under development, so the files and folders are a bit messy, but it works fine with a small pruning rate (i.e. a small c_rate); it can be buggy with a large c_rate. Still working on it.
Hope this would help.
from dynamic-network-surgery.
Hi @kai-xie , I don't understand. If you keep the same hyper-parameters of the conv layers in the second phase, wouldn't the algorithm keep pruning these layers? By pruning the layers separately, I meant not pruning or splicing the conv layers any further while pruning the fully connected layers (there are a bunch of ways to do this, but you can simply set iter_stop to zero or a negative number for those layers). Also, I didn't really encounter the learning rate problem you did, but the pruning rates do change during training (obviously not to 100% or 0%), and that's how the algorithm works. In your case, I think you can first try what I said, and maybe larger c_rates in the fully connected layers, to see if the pruning still fails.
@yiwenguo
Thanks for your reply!
I will try again to see how it works.
It worked when training the conv and ip layers separately by controlling iter_stop. Thank you very much! @yiwenguo
Hi @kai-xie ,
I have some concerns:
- Could you share the details of the prototxt files, or step-by-step instructions, for pruning the conv and ip layers separately?
- How do you know whether the network has been pruned, compared to the original model?
Thanks.
Awesome! Thank you for your detailed instructions. They are really helpful. I am working on the ImageNet dataset; I hope it works well.
Hi @kai-xie , I'm working on DNS recently.
For Problem 3 you pointed out, did the constant setting of mu and std work in the end?
I update mu and std every iteration, and I find the pruning rarely changes between iterations.
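For reference, the DNS paper derives the pruning threshold from the mean and standard deviation of a layer's weights, with a hysteresis band (roughly 0.9t to 1.1t) so masks don't flip every iteration. A minimal sketch of one such mask update follows; taking mu and std over |W|, and the exact 0.9/1.1 factors, are assumptions here, and whether mu/std are frozen or recomputed each iteration is exactly the choice being discussed above:

```python
import numpy as np

def update_mask(weights, mask, c_rate, lo=0.9, hi=1.1):
    """One DNS-style mask update with a hysteresis band around the threshold.

    t = mu + c_rate * std is the base threshold (mu, std over |W| here, an
    assumption). Weights well below t are pruned, weights well above t are
    spliced back in, and weights in between keep their current mask entry.
    """
    a = np.abs(weights)
    t = a.mean() + c_rate * a.std()
    new_mask = mask.copy()
    new_mask[a < lo * t] = 0   # prune: clearly unimportant
    new_mask[a > hi * t] = 1   # splice: became important again
    return new_mask

w = np.array([0.01, -0.5, 0.02, 0.8, -0.03])
m = np.ones_like(w)
print(update_mask(w, m, c_rate=0.0))
```

With a fixed weight distribution, repeated calls leave the mask unchanged, which matches the observation above that the pruning rarely changes between iterations once mu and std settle.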