
Ternary Weight Networks (TWNs)

This repository implements the benchmarks in our paper "Ternary Weight Networks", which was accepted by the 1st NIPS Workshop on Efficient Methods for Deep Neural Networks (EMDNN), 2016.

Please cite TWNs in your publications if it helps your research:

@article{li2016ternary,
  Author = {Li, Fengfu and Zhang, Bo and Liu, Bin},
  Journal = {arXiv preprint arXiv:1605.04711},
  Title = {Ternary Weight Networks},
  Year = {2016}
}

Build

Dependencies are identical to those of the master branch of Caffe. See the Caffe project site for detailed installation instructions.

NOTE:

  1. Some layers may only have GPU implementations, so CUDA support and a GPU device are required.
  2. The Makefile has been modified to accommodate Ubuntu 16.04. For earlier versions of Ubuntu, please replace the Makefile with the original one.

Steps to run a demo

  1. Prepare the data
    $ ./data/mnist/get_mnist.sh

  2. Convert the data to LMDB
    $ ./examples/mnist/create_mnist.sh

  3. Configure the training script (a sketch of its expected contents follows these steps)
    3.1 set PRECISION in train_lenet_tn.sh
    3.2 set the DELTA value (default: 0.7)

  4. Train
    $ cd examples/mnist
    $ sh train_lenet_tn.sh

  5. Run-time usage (to be added)
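
A minimal sketch of what train_lenet_tn.sh is expected to contain, based on step 3 above and on the training command quoted verbatim in the issues below; the variable names here are illustrative, and the shipped script may differ:

    #!/bin/sh
    # Hypothetical sketch of examples/mnist/train_lenet_tn.sh -- variable
    # names are illustrative, not necessarily those in the repository.
    PRECISION=ternary   # quantization mode, set in step 3.1
    DELTA=0.7           # threshold factor, set in step 3.2 (default 0.7)

    ./build/tools/caffe train --gpu=0 \
        --precision=${PRECISION} \
        --delta=${DELTA} \
        --solver=./examples/mnist/lenet_tn_solver.prototxt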

Contact

You are welcome to send a message to [email protected] if you have any issues with this code.


caffe-twns's Issues

Per Layer Ternarization

The boolean controlling ternarization in binary.hpp seems to affect all inner_product and conv layers, which suggests that all layers are either ternarized or none are. Is there a way to ternarize only some layers?

I'm thinking of creating new layer types, inner_product_twn and conv_twn, and reverting the original inner_product and conv layers to baseline Caffe behavior. Do you see any problems with this approach?

Also, alpha and delta are computed separately for each layer, correct?

[Question] Are delta and alpha layer-wise or filter-wise?

Hi,

I wonder if alpha and delta are layer-based (as claimed in the Trained Ternary Quantization paper) or filter-based (like XNOR-Net). I assumed they were filter-based because your paper mentions n as the filter size, but after reading the TTQ paper, in which they say TWN has layer-wise alpha and delta, I am not so sure.

Can you clarify?

regards,

Khoi
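
For context, the quantities under discussion are, as defined in the arXiv version of the paper (restated here for reference, with n the number of weights in whatever group is being quantized):

    \Delta^{*} \approx 0.7 \cdot E(|W|) \approx \frac{0.7}{n} \sum_{i=1}^{n} |W_i|

    \alpha^{*} = \frac{1}{|I_{\Delta}|} \sum_{i \in I_{\Delta}} |W_i|,
    \qquad I_{\Delta} = \{ i : |W_i| > \Delta \}

The formulas themselves are agnostic about whether the sum runs over a whole layer or a single filter; that choice is exactly what this question asks about.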

What is the use of root_solver_ in caffe-twns?

Hi @fengfu-chris

I am looking at the code of solver.cpp and I see a new variable, root_solver_. Can you help me understand the function of this variable?

I also wanted to ask: how have you utilized the GPU to train your binary and ternary weight models? GPUs currently do not support binary operands. Have you stored your binary weights as floats and then trained on the GPU? Does this require modification of the GPU kernels?

Thanks,
Ayushi

Extract feature

Hello, I used a model with the .tn suffix to extract features for a recognition task, and the program reported an error:

libprotobuf FATAL /usr/local/include/google/protobuf/repeated_field.h:613] CHECK failed: (index) < (size()):
terminate called after throwing an instance of 'google::protobuf::FatalException'
what(): CHECK failed: (index) < (size()):

My guess is that this error means the model and the network file do not match: the stored ternary weights probably cannot be read back in, so feature extraction fails. Have you tried anything along these lines? Do you have any suggestions for solving this problem? Thanks.

Passthrough is not supported, GL is disabled error

Hello !

I tried cloning the repository and running it according to the steps above, and I ended up with the following error:

[1732:0417/102941.488:ERROR:gpu_init.cc(440)] Passthrough is not supported, GL is disabled

This happened when I ran create_mnist.sh. How do I fix it?

demo

Hello, the paper says ternary networks can achieve 16x or 32x compression, but I ran the LeNet demo on the MNIST dataset:

LOG= ./build/tools/caffe train --gpu=0 --precision=ternary --delta=7 --solver=./examples/mnist/lenet_tn_solver.prototxt --debug=no 2>&1 | tee $LOG

The resulting model is lenet_tn_iter_30000.caffemodel, with a size of 2.3 MB (2,339,071 bytes). No compression was achieved; the model actually seems to have grown, from the original 1.7 MB to 2.3 MB.
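
For reference, the 16x figure corresponds to the packed encoding alone: the 2-bit-per-weight format written by Blob::TernaryToProto() (quoted in the "Blob ternarization" issue below) stores 16 ternary weights per 32-bit word, versus 32 bits for a full-precision float:

    32 \text{ bits/weight} \;/\; 2 \text{ bits/weight} = 16\times

A model file that is not actually written in the packed format, or that additionally stores full-precision copies of the weights, would not show this reduction.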

Build error

Hi @fengfu-chris ,
Do you have any idea how to solve the following errors?

make: *** [.build_release/tools/caffe.bin] Error 1
.build_release/lib/libcaffe.so: undefined reference to `caffe::Layer<float>::BinaryToProto(caffe::LayerParameter*)'

make: *** [.build_release/tools/upgrade_net_proto_text.bin] Error 1
.build_release/lib/libcaffe.so: undefined reference to `caffe::Layer<float>::BinaryToProto(caffe::LayerParameter*)'

make: *** [.build_release/tools/upgrade_net_proto_binary.bin] Error 1
.build_release/lib/libcaffe.so: undefined reference to `caffe::Layer<float>::BinaryToProto(caffe::LayerParameter*)'

caffe test can't get the same accuracy

Thanks for sharing these code!

I trained the MNIST example with your code and saved the ".tn" weights.
Then I used caffe test to load the model via "--weights=....".
Using the ".caffemodel", I get a high accuracy (about 0.98).
But when I change to the corresponding ".caffemodel.tn", the accuracy drops to about 0.1.

I then ran the code in debug mode. The debug information shows that it applies the TWN reading mode to read the ".caffemodel.tn" file, but the values seem to be wrong.

Can you tell me where I went wrong, and how can I load your ".caffemodel.tn" weights into the net?

Something else: I also read the issue "the weight of InnerProduct layer #4"; reading the code helped me a little, but not enough, maybe because I am a new learner.

Training on own dataset

@fengfu-chris, I would be glad if you could tell me the training process for TWNs on my own dataset. What requirements and steps do I need to follow? Thanks.

the weight of InnerProduct layer

Hi, I changed the network to fully connected layers of size 1024*1024*1024*10 and used the Python interface to pull out solver.net.params['ip1'][0].data,
which should be the weights of the fully connected layer. Why are they not ternary? Did I do something wrong?

[screenshot of the weight values omitted]

Getting build error

Hi @fengfu-chris, I tried to build caffe-twns on my Ubuntu 14.04 LTS machine, but I am getting a cuDNN error like the one below. CUDA is working fine on my system. May I know how to solve this problem?

sanjay@sanjay:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61
make: *** [.build_release/src/caffe/layers/memory_data_layer.o] Error 1
In file included from ./include/caffe/util/device_alternate.hpp:40:0,
                 from ./include/caffe/common.hpp:19,
                 from ./include/caffe/blob.hpp:8,
                 from ./include/caffe/data_transformer.hpp:7,
                 from src/caffe/layers/data_layer.cpp:8:
./include/caffe/util/cudnn.hpp: In function ‘void caffe::cudnn::createPoolingDesc(cudnnPoolingStruct**, caffe::PoolingParameter_PoolMethod, cudnnPoolingMode_t*, int, int, int, int, int, int)’:
./include/caffe/util/cudnn.hpp:127:41: error: too few arguments to function ‘cudnnStatus_t cudnnSetPooling2dDescriptor(cudnnPoolingDescriptor_t, cudnnPoolingMode_t, cudnnNanPropagation_t, int, int, int, int, int, int)’
         pad_h, pad_w, stride_h, stride_w));
                                         ^
./include/caffe/util/cudnn.hpp:15:28: note: in definition of macro ‘CUDNN_CHECK’
     cudnnStatus_t status = condition; \
                            ^
In file included from ./include/caffe/util/cudnn.hpp:5:0,
                 from ./include/caffe/util/device_alternate.hpp:40,
                 from ./include/caffe/common.hpp:19,
                 from ./include/caffe/blob.hpp:8,
                 from ./include/caffe/data_transformer.hpp:7,
                 from src/caffe/layers/data_layer.cpp:8:
/usr/local/cuda/include/cudnn.h:799:27: note: declared here
 cudnnStatus_t CUDNNWINAPI cudnnSetPooling2dDescriptor(
                           ^
make: *** [.build_release/src/caffe/layers/data_layer.o] Error 1

question about training CIFAR10

Hello, when I train CIFAR-10 with the network structure and initial learning rate provided in the paper, training fails: the loss explodes and goes straight to NaN. Your VGG-7 reference structure is "2×(128-C3) + MP2 + 2×(256-C3) + MP2 + 2×(512-C3) + MP2 + 1024-FC + Softmax". I would like to ask you two questions (the dimension arithmetic is sketched after this issue):
1. Before the 1024-FC layer, the feature map has size batch × 512 × 4 × 4. How does this 1024-FC layer turn the 8192 dimensions into 10?
2. Also, with the BPWNs network structure 2×(1024-FC) − 10-SVM, training with your reference base_lr=0.1 gives a NaN loss. What could be the reason?
Thanks.
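
On the first question, the dimension arithmetic alone is unambiguous, and a plausible reading of the paper's notation (an assumption on my part, not confirmed by the authors) is that 1024-FC denotes only the hidden layer, with the 10-way output folded into "Softmax":

    512 \times 4 \times 4 = 8192 \;\to\; \text{1024-FC} \;\to\; \text{10-way Softmax}

i.e. an 8192-to-1024 fully connected layer followed by a 1024-to-10 classifier, rather than a single layer mapping 8192 directly to 10.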

How are the approximate solutions obtained?

Hello, I recently had the pleasure of reading your paper, but this part confused me. Could you explain how the a/3 and 0.6σ here are derived?

[screenshot of the paper's threshold approximation omitted]
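
A sketch of the derivation as I understand it from the arXiv version of the paper (my own reconstruction, so treat the steps as an assumption): the threshold maximizes

    \Delta^{*} = \arg\max_{\Delta > 0} \frac{1}{|I_{\Delta}|} \Big( \sum_{i \in I_{\Delta}} |W_i| \Big)^{2}

If the weights are uniform on [-a, a], then |I_{\Delta}| \propto (a - \Delta) and \sum_{i \in I_{\Delta}} |W_i| \propto (a - \Delta) \cdot \frac{a + \Delta}{2}, so the objective is proportional to (a - \Delta)(a + \Delta)^{2}. Setting its derivative to zero gives

    \frac{d}{d\Delta} (a - \Delta)(a + \Delta)^{2} = (a + \Delta)(a - 3\Delta) = 0
    \quad\Rightarrow\quad \Delta^{*} = \frac{a}{3}

For weights drawn from N(0, \sigma^{2}) there is no closed form; solving the same maximization numerically gives \Delta^{*} \approx 0.6\sigma.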

accuracy only 89% for CIFAR10?

The paper says accuracy can reach 92.56%.
Is anything wrong with my training process?

net: "vgg7.prototxt"
test_iter: 100
test_interval: 1000
base_lr: 0.1
momentum: 0.9
weight_decay: 0.0001
lr_policy: "multistep"
gamma: 0.1
stepvalue: 40000 #80(epoch) * 500
stepvalue: 60000 #120 * 500
display: 100
max_iter: 100000
snapshot: 2000
snapshot_prefix: "/home/wxxu/caffe-TWN/examples/cifar/models/vgg7"
solver_mode: GPU

Blob ternarization

In the Blob::TernaryToProto() function, we seem to be using 0.5*alpha as delta instead of the actual value of delta. Is there a reason for doing it this way?

const double* data_vec = cpu_binary();
// Pack 16 ternarized weights into one 32-bit word, 2 bits per weight:
// b1 records whether the weight is above -0.5*alpha, b2 whether it is
// above +0.5*alpha, so (b2,b1) = (0,0) -> -alpha, (0,1) -> 0, (1,1) -> +alpha.
for (int i = 0; i < count_; i += 16) {
  unsigned int n = 0;
  for (int j = 0; j < 16 && (i + j) < count_; j++) {
    int b1 = data_vec[i + j] > -0.5 * alpha ? 1 : 0;
    int b2 = data_vec[i + j] >  0.5 * alpha ? 1 : 0;
    n ^= (b1 << (2 * j));
    n ^= (b2 << (2 * j + 1));
  }
  proto->add_ternary_data(n);
}
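
One likely answer to the question above (an assumption, not confirmed by the author): by the time TernaryToProto() runs, cpu_binary() already holds ternarized values in {-alpha, 0, +alpha}, so any threshold strictly between 0 and alpha, such as 0.5*alpha, classifies them exactly. The following standalone sketch (illustrative code, not taken from the repository) round-trips the 2-bit encoding to show this:

    #include <cstdio>

    // Pack up to 16 ternary weights (each exactly -alpha, 0, or +alpha)
    // into one 32-bit word using the same 2-bit scheme as above.
    unsigned int pack16(const double* w, int count, double alpha) {
      unsigned int n = 0;
      for (int j = 0; j < 16 && j < count; j++) {
        int b1 = w[j] > -0.5 * alpha ? 1 : 0;  // above the lower threshold?
        int b2 = w[j] >  0.5 * alpha ? 1 : 0;  // above the upper threshold?
        n ^= (b1 << (2 * j));
        n ^= (b2 << (2 * j + 1));
      }
      return n;
    }

    // Invert the encoding: (b2,b1) = (0,0) -> -alpha, (0,1) -> 0, (1,1) -> +alpha.
    void unpack16(unsigned int n, int count, double alpha, double* w) {
      for (int j = 0; j < 16 && j < count; j++) {
        int b1 = (n >> (2 * j)) & 1;
        int b2 = (n >> (2 * j + 1)) & 1;
        w[j] = b2 ? alpha : (b1 ? 0.0 : -alpha);
      }
    }

    int main() {
      const double alpha = 0.25;
      double in[5]  = {-alpha, 0.0, alpha, 0.0, -alpha};
      double out[5];
      unpack16(pack16(in, 5, alpha), 5, alpha, out);
      // Each weight should print unchanged after the round trip.
      for (int j = 0; j < 5; j++) printf("%+.2f -> %+.2f\n", in[j], out[j]);
      return 0;
    }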
