Comments (11)
OK, try to fix the precision issue by adding
gradParameters:mul(1e+5)
after line 184 in train.lua
from xnor-net.
Let's double-check a few things first:
1- Can you get the same accuracy with the pretrained models?
2- Can you train the BWN?
3- I have noticed that in some versions of cuDNN, the precision of division causes convergence problems. If you are using Adam, you can multiply all the gradients by a large number to prevent the precision error that leads to NaN.
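A minimal sketch of why the gradient-scaling suggestion is specific to Adam, written in plain Python rather than Torch Lua so that it runs without Torch. It assumes the textbook Adam update for a single scalar at the first step; the function names, the learning rate, and the gradient value are illustrative and do not come from the repository. The last line shows one possible way tiny gradients can lose precision in float32 (their squares underflow to zero); the thread does not confirm that this is the actual cause of the NaN.

```python
import math
import struct


def to_float32(x):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def adam_first_step(grad, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Size of the first textbook Adam update for one scalar gradient."""
    m = (1 - beta1) * grad          # first-moment estimate
    v = (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1)         # bias correction at step 1
    v_hat = v / (1 - beta2)
    return lr * m_hat / (math.sqrt(v_hat) + eps)


def sgd_step(grad, lr=1e-3):
    """Plain SGD update size, for comparison."""
    return lr * grad


SCALE = 1e5   # the factor suggested in the thread
grad = 0.01   # illustrative gradient value (assumption)

# Adam divides by sqrt(v), so a constant factor on the gradient nearly cancels.
print(adam_first_step(grad), adam_first_step(grad * SCALE))
# SGD has no such normalization: the step grows by the same factor.
print(sgd_step(grad), sgd_step(grad * SCALE))
# Squared tiny gradients can underflow in float32; scaling first avoids that.
print(to_float32(1e-25 ** 2), to_float32((1e-25 * SCALE) ** 2))
```

Because the Adam step is almost unchanged by the factor, scaling the gradients mainly moves the intermediate values into a range with better numerical behavior. With plain SGD, the same factor would have to be compensated in the learning rate.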
Hi @mrastegari
The accuracies I get for the two pretrained models are 56.67 and 42.37, respectively. I can train BWN, but I stopped at epoch #30, where top-1 accuracy was 25.57. That training run was several weeks ago, before you fixed some bugs. For XNOR-Net, however, I have not been able to get training to converge. I don't know if others have encountered the same issue.
Just to make sure: add gradParameters:mul(1e+5) after updateBinaryGradWeight(convNodes), right? It still doesn't work for me. Has anyone experienced the same issue?
After how many iterations do you see the divergence? Also, try to follow the paper by replacing the updateBinaryGradWeight function with:
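function updateBinaryGradWeight(convNodes)
   -- Loop over all but the first and last entries of convNodes.
   for i = 2, #convNodes-1 do
      local n = convNodes[i].weight[1]:nElement()  -- elements in one output filter
      local s = convNodes[i].weight:size()
      -- Zero the gradient wherever the weight is <= -1 or >= 1.
      convNodes[i].gradWeight[convNodes[i].weight:le(-1)] = 0
      convNodes[i].gradWeight[convNodes[i].weight:ge(1)] = 0
      -- Add 1/n to every gradient entry, then scale by (1 - 1/s[2]).
      convNodes[i].gradWeight:add(1/(n)):mul(1-1/s[2])
   end
   -- With more than one GPU, synchronize the parameters across replicas.
   if opt.nGPU > 1 then
      model:syncParameters()
   end
end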
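To make the per-element arithmetic of the function above easier to check by hand, here is a small sketch in plain Python (not Torch Lua) that mirrors the snippet as posted for a single layer, using flat lists instead of tensors. The name `in_channels` for `s[2]` is my reading of the weight layout and is an assumption, as are all the numbers. The sketch only reproduces the posted operations; it makes no claim about whether they match the paper's formula.

```python
def update_binary_grad(weights, grads, n, in_channels):
    """Mirror of the posted per-layer gradient update on flat Python lists.

    weights, grads: same-length lists of floats for one layer.
    n: elements in one output filter (weight[1]:nElement() in the thread).
    in_channels: hypothetical name for s[2], the second weight dimension.
    """
    updated = []
    for w, g in zip(weights, grads):
        # Zero the gradient where the weight is <= -1 or >= 1.
        if w <= -1 or w >= 1:
            g = 0.0
        # Add 1/n, then scale by (1 - 1/in_channels), as in the snippet.
        updated.append((g + 1.0 / n) * (1.0 - 1.0 / in_channels))
    return updated


# Illustrative values only; none of these numbers come from the thread.
print(update_binary_grad([-1.5, 0.2, 1.0], [0.4, 0.4, 0.4], n=4, in_channels=2))
```

Note that entries zeroed by the mask do not stay at zero in this version, because `1/n` is added to every entry after masking.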
Hi @mrastegari
Thanks for your help, but it still doesn't work for me. Training diverges from the very beginning with err=nan. I have started retraining BinaryNet, which seems to work well so far. XNOR-Net has never worked for me.
I just pushed a modification. Can you check it?
Thanks for your help. First, I changed '-cache' to './cache/', but it still doesn't work. The error becomes 'nan' right at the beginning every time, even though I have run the experiments dozens of times with different random seeds. Has anyone successfully reproduced the XNOR experiments yet? I am confused. BTW, I re-ran the Binary-Net experiment and got an accuracy of 51.65% in the end. Does the XNOR code work well for you? What do you think the problem is?
There is definitely something wrong with your setup. I asked a friend to try it on his machine, and he could reproduce the same result (~43%). Which version of Binary-Net are you using? 51.65% top-1 is too good for binary-input-and-binary-weight. Do you have code for that?
Hi @mrastegari
I found the problem: running on multiple GPUs. When I switched to 1 GPU, the model started to converge. For the multi-GPU version, maybe I used different CUDA and cuDNN versions. Could you share which versions you use? Thanks!
I use CUDA 7.5 and cuDNN 5. I also had this problem on some machines whose mainboards were incompatible with the GPUs.