Sparse Variational Dropout, ICML 2017

License: GNU General Public License v3.0

Python 48.69% Jupyter Notebook 51.31%

variational-dropout-sparsifies-dnn's Issues

Details for reproducing LeNet-5 results in ICML paper

Can you please specify the training details for generating the LeNet5 pruning results in your paper? Did you pretrain the network or use the warm-up procedure for the KL divergence term? What learning rate did you use?
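For readers unfamiliar with the warm-up the question mentions, a minimal sketch of a linear KL warm-up schedule follows. The thread does not say whether the paper's LeNet-5 runs used such a schedule, and the function name `kl_weight` and the parameter `warmup_epochs` are hypothetical, chosen only for illustration.

```python
def kl_weight(epoch, warmup_epochs):
    """Linear warm-up coefficient for the KL term.

    Returns 0.0 at epoch 0, grows linearly, and stays at 1.0 once
    `epoch` reaches `warmup_epochs`. With no warm-up (warmup_epochs <= 0)
    the KL term is used at full weight from the start.
    """
    if warmup_epochs <= 0:
        return 1.0
    return min(1.0, max(0.0, epoch / float(warmup_epochs)))


# Illustrative use inside a training loop (names are assumptions):
#   loss = data_term + kl_weight(epoch, warmup_epochs) * kl_term
```

The idea is that the data term dominates early in training, so weights are not pruned before the network has learned anything useful.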

Additive Noise Reparameterization

Hi Arsenii,

I have noticed something quite strange in your code.

https://github.com/ars-ashuha/variational-dropout-sparsifies-dnn/blob/master/nets/layers.py#L89
si = T.sqrt(T.dot(input * input, T.exp(log_alpha) * self.W * self.W)+1e-8)

Instead of multiplying input^2 by T.exp(log_sigma2), you compute sigma^2 as alpha * W^2, so the additive reparameterization has no effect when alpha is large.

The conv layers have the same issue.
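The difference between the two parameterizations can be sketched in NumPy. This is an illustration under assumptions, not the repository's Theano code: the function names are hypothetical, and `log_sigma2` stands for an independent per-weight log-variance parameter as the issue describes it.

```python
import numpy as np


def forward_alpha_param(x, W, log_alpha, rng, eps=1e-8):
    """Pre-activation variance built from alpha * W^2, as in the quoted line."""
    mu = x @ W
    var = (x * x) @ (np.exp(log_alpha) * W * W)
    return mu + np.sqrt(var + eps) * rng.standard_normal(mu.shape)


def forward_additive_param(x, W, log_sigma2, rng, eps=1e-8):
    """Pre-activation variance built from an independent log_sigma2 parameter."""
    mu = x @ W
    var = (x * x) @ np.exp(log_sigma2)
    return mu + np.sqrt(var + eps) * rng.standard_normal(mu.shape)


def log_alpha_from_sigma2(W, log_sigma2, eps=1e-8):
    """Recover log_alpha = log_sigma2 - log(W^2), e.g. for the KL term."""
    return log_sigma2 - np.log(W * W + eps)
```

Both functions produce the same samples when `log_sigma2 = log_alpha + log(W^2)`. What differs is which quantities are treated as free parameters: in the second form the noise term does not depend on `W`, so the gradient with respect to `W` flows only through the mean.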

failed to run on CPU

Hi guys, I have a question. Is a GPU mandatory to run LeNet-5?

I tried to run it on CPU:
THEANO_FLAGS='floatX=float32,device=cpu,lib.cnmem=1' ipython experiments/lenet/lenet5-ard.py
and got the error:

ImportError Traceback (most recent call last)
variational-dropout-sparsifies-dnn/experiments/lenet/lenet5-ard.py in ()
5 from nets import objectives
6 from theano import tensor as T
----> 7 from nets import optpolicy, layers
8 from lasagne import init, nonlinearities as nl, layers as ll
9 from lasagne.layers.dnn import Pool2DDNNLayer as MaxPool2DLayer

variational-dropout-sparsifies-dnn/nets/layers.py in ()
5 from lasagne.nonlinearities import rectify
6 from lasagne.layers.base import Layer
----> 7 from lasagne.layers.dnn import Conv2DDNNLayer as ConvLayer
8 from theano.sandbox.cuda import dnn
9 from theano.sandbox.rng_mrg import MRG_RandomStreams as RandomStreams

lib/python2.7/site-packages/lasagne/layers/dnn.py in ()
40 else:
41 raise ImportError(
---> 42 "requires GPU support -- see http://lasagne.readthedocs.org/en/"
43 "latest/user/installation.html#gpu-support") # pragma: no cover
44

ImportError: requires GPU support -- see http://lasagne.readthedocs.org/en/latest/user/installation.html#gpu-support
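The traceback shows the failure happening at import time, when `lasagne.layers.dnn` is loaded without GPU support. One possible workaround is a guarded import that falls back to a non-cuDNN layer. The sketch below is a generic helper; the name `first_available` and the candidate list are hypothetical, and it has not been tested against this repository.

```python
import importlib


def first_available(candidates):
    """Return the first attribute that imports cleanly, or None.

    `candidates` is a list of (module_name, attribute_name) pairs,
    tried in order. Modules that raise ImportError are skipped.
    """
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr_name)
        except (ImportError, AttributeError):
            continue
    return None


# Assumed preference order: cuDNN layer first, generic Lasagne layer second.
CONV_LAYER_CANDIDATES = [
    ("lasagne.layers.dnn", "Conv2DDNNLayer"),
    ("lasagne.layers", "Conv2DLayer"),
]

# Illustrative use (requires Lasagne to be installed):
#   ConvLayer = first_available(CONV_LAYER_CANDIDATES)
```

This only addresses the layer-class import. The direct `dnn.dnn_conv` calls in nets/layers.py would still need a CPU replacement; `theano.tensor.nnet.conv2d` is a likely candidate, but its arguments differ (for example `filter_flip` instead of `conv_mode`), so the mapping should be checked carefully.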

How do you choose the threshold?

Dear author,
I found that you chose 3 as the threshold for the weight mask. I am curious how you picked this value.
Thanks a lot!
Feng
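One way to read the value 3, assuming the usual correspondence between Gaussian dropout noise alpha and a binary dropout rate p, alpha = p / (1 - p): a threshold of log_alpha = 3 corresponds to a dropout rate of roughly 0.95. The sketch below only illustrates that arithmetic and a thresholded mask; the function names are hypothetical.

```python
import math


def dropout_rate_from_log_alpha(log_alpha):
    """Binary dropout rate p for a given log_alpha, using alpha = p / (1 - p).

    Solving for p gives p = alpha / (1 + alpha) = sigmoid(log_alpha).
    """
    return 1.0 / (1.0 + math.exp(-log_alpha))


def keep_mask(log_alphas, thresh=3.0):
    """True where a weight is kept, i.e. where log_alpha is below the threshold."""
    return [la < thresh for la in log_alphas]


p_at_threshold = dropout_rate_from_log_alpha(3.0)  # about 0.95
```

Weights with log_alpha at or above the threshold are dominated by noise, so setting them to zero at test time changes the output very little.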

Will you make the code for the algorithm itself available?

Is it possible to compress already existing trained network?

cudnn version?

Hi, I am not able to run any of the experiments except lenet300-100. The error I get comes from the convolution layer. Also, Theano recently changed its backend to gpuarray. Should I use a specific version of CUDA/cuDNN, or is any additional configuration needed to use gpuarray?

Cross entropy loss multiplied by training set size

I was running the Colab notebook (PyTorch) and had two questions:

1. Why does the loss go below 0? Is it due to the regularisation term added for the dropout? Also, I noticed that I only got a good compression ratio once the loss went negative.
2. Why is the cross-entropy loss multiplied by the training set size for each batch, when we usually divide the epoch loss by the dataset size?

I would really appreciate it if you could help me with my queries.
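A possible explanation of the scaling, sketched under the standard stochastic variational inference setup rather than taken from the notebook itself: the variational objective sums the log-likelihood over the whole training set but counts the KL term once, so a mean over a minibatch has to be multiplied by the training set size N to estimate that sum. The function and parameter names below are hypothetical.

```python
def sgvb_loss(batch_nll_mean, kl, train_size):
    """Minibatch estimate of the negative ELBO: N * (mean batch NLL) + KL."""
    return train_size * batch_nll_mean + kl


def per_example_loss(batch_nll_mean, kl, train_size):
    """The same objective divided by N: mean batch NLL + KL / N.

    It has the same minimizer as sgvb_loss; only the scale (and hence the
    effective learning rate) differs.
    """
    return batch_nll_mean + kl / train_size
```

On the first question: if the regularizer is implemented only up to an additive constant, it can take negative values, so the total can drop below zero without anything being wrong. Whether the notebook does this should be checked in its code.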

How to approximate the D_KL?

Could you please share the code for your method of approximating D_KL (with k_1, k_2 and k_3)?
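A minimal sketch of the approximation, assuming the form given in the paper: -D_KL ≈ k_1 * sigmoid(k_2 + k_3 * log_alpha) - 0.5 * log(1 + 1/alpha) + C, with C = -k_1. The constants below are the ones I recall from the paper and should be checked against it. The function names are hypothetical, and this is plain Python rather than the repository's Theano code.

```python
import math

# Constants of the approximation (verify against the paper before relying on them).
K1, K2, K3 = 0.63576, 1.87320, 1.48695


def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def neg_kl_approx(log_alpha):
    """Approximate -D_KL for a single weight, including the constant C = -K1."""
    return (K1 * _sigmoid(K2 + K3 * log_alpha)
            - 0.5 * math.log1p(math.exp(-log_alpha))
            - K1)


def kl_approx(log_alphas):
    """Approximate total KL, summed over an iterable of log_alpha values."""
    return -sum(neg_kl_approx(la) for la in log_alphas)
```

With the constant included, the per-weight KL tends to 0 as alpha grows. Without it, the per-weight term tends to -k_1, which shifts the total by a constant without changing the gradients.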

How to run examples on CPU?

I tried to run it on CPU:
THEANO_FLAGS='floatX=float32,device=cpu,lib.cnmem=1' ipython experiments/lenet/lenet5-ard.py
and got the error:
lib/python2.7/site-packages/lasagne/layers/dnn.py in ()
40 else:
41 raise ImportError(
---> 42 "requires GPU support -- see http://lasagne.readthedocs.org/en/"
43 "latest/user/installation.html#gpu-support") # pragma: no cover
44

ImportError: requires GPU support -- see http://lasagne.readthedocs.org/en/latest/user/installation.html#gpu-support
Then I read an answer: "if you replace the GPU convolution (dnn.dnn_conv) in Conv2DVarDropOutARD with a CPU one, it will fix the issue."
I found the GPU convolution (dnn.dnn_conv) in /home/tom/variational-dropout-sparsifies-dnn/nets/layers.py, but I am not familiar with Theano. Has anyone tried changing this to a CPU version?
if deterministic:
    conved = dnn.dnn_conv(img=input, kerns=T.switch(T.ge(log_alpha, thresh), 0, self.W),
                          subsample=self.stride, border_mode=border_mode,
                          conv_mode=conv_mode)
else:
    W = self.W
    if train_clip:
        W = T.switch(clip_mask, 0, W)
    conved_mu = dnn.dnn_conv(img=input, kerns=W,
                             subsample=self.stride, border_mode=border_mode,
                             conv_mode=conv_mode)
    conved_si = T.sqrt(1e-8+dnn.dnn_conv(img=input * input, kerns=T.exp(log_alpha) * W * W,
                                         subsample=self.stride, border_mode=border_mode,
                                         conv_mode=conv_mode))
    conved = conved_mu + conved_si * self._srng.normal(conved_mu.shape, avg=0, std=1)
return conved

How to generate the animated image in the README that shows weight sparsity throughout training

Hi, may I ask how you generated the animated picture shown in the README?

Notebook and example

Arsenii, good afternoon!
Following up on yesterday's conversation:

I have uploaded a fork for Python 3.5 and Theano 1.0.0, with your latest commits, at https://github.com/nnnet/Variational-dropout-sparsifies-dnn
I will write to you in the messenger about the *.ipynb examples.

bayesgroup / variational-dropout-sparsifies-dnn
