s9xie / dsn
Deeply-supervised Nets
Home Page: http://vcl.ucsd.edu/~sxie/2014/09/12/dsn-project/
License: Other
I want to reproduce the SVHN results using Deeply-Supervised Nets.
However, I am new to Caffe. Are there configuration files to train and test on the SVHN dataset?
Thank you.
I am using DSN on Ubuntu 12.04 with CUDA 5.5.
I get the following error when running ./train_full.sh:
$ ./train_full.sh
I0622 00:04:40.815856 28860 train_net.cpp:26] Starting Optimization
I0622 00:04:40.816721 28860 solver.cpp:26] Creating training net.
I0622 00:04:40.816889 28860 net.cpp:70] Creating Layer data
I0622 00:04:40.816900 28860 net.cpp:105] data -> data
I0622 00:04:40.816915 28860 net.cpp:105] data -> label
I0622 00:04:40.816928 28860 data_layer.cpp:148] Opening leveldb cifar10_gcn_padded-leveldb/cifar-train-leveldb
F0622 00:04:40.816994 28860 data_layer.cpp:151] Check failed: status.ok() Failed to open leveldb cifar10_gcn_padded-leveldb/cifar-train-leveldb
IO error: cifar10_gcn_padded-leveldb/cifar-train-leveldb/LOCK: No such file or directory
*** Check failure stack trace: ***
@ 0x7fec1db18b7d google::LogMessage::Fail()
@ 0x7fec1db1ac7f google::LogMessage::SendToLog()
@ 0x7fec1db1876c google::LogMessage::Flush()
@ 0x7fec1db1b51d google::LogMessageFatal::~LogMessageFatal()
@ 0x45d17f caffe::DataLayer<>::SetUp()
@ 0x43a434 caffe::Net<>::Init()
@ 0x43bb2a caffe::Net<>::Net()
@ 0x426c7c caffe::Solver<>::Solver()
@ 0x40e79f main
@ 0x7fec1b8607ed (unknown)
@ 0x40fdcd (unknown)
Aborted (core dumped)
I0622 00:04:40.889854 28863 finetune_net.cpp:25] Starting Optimization
I0622 00:04:40.890429 28863 solver.cpp:26] Creating training net.
I0622 00:04:40.890560 28863 net.cpp:70] Creating Layer data
I0622 00:04:40.890580 28863 net.cpp:105] data -> data
I0622 00:04:40.890600 28863 net.cpp:105] data -> label
I0622 00:04:40.890625 28863 data_layer.cpp:148] Opening leveldb cifar10_gcn_padded-leveldb/cifar-train-leveldb
F0622 00:04:40.890717 28863 data_layer.cpp:151] Check failed: status.ok() Failed to open leveldb cifar10_gcn_padded-leveldb/cifar-train-leveldb
IO error: cifar10_gcn_padded-leveldb/cifar-train-leveldb/LOCK: No such file or directory
*** Check failure stack trace: ***
@ 0x7fb874271b7d google::LogMessage::Fail()
@ 0x7fb874273c7f google::LogMessage::SendToLog()
@ 0x7fb87427176c google::LogMessage::Flush()
@ 0x7fb87427451d google::LogMessageFatal::~LogMessageFatal()
@ 0x45a63f caffe::DataLayer<>::SetUp()
@ 0x42b564 caffe::Net<>::Init()
@ 0x42cc5a caffe::Net<>::Net()
@ 0x435d7c caffe::Solver<>::Solver()
@ 0x40e79f main
@ 0x7fb871fb97ed (unknown)
@ 0x40fdfd (unknown)
Aborted (core dumped)
I0622 00:04:40.968624 28866 finetune_net.cpp:25] Starting Optimization
I0622 00:04:40.969331 28866 solver.cpp:26] Creating training net.
I0622 00:04:40.969472 28866 net.cpp:70] Creating Layer data
I0622 00:04:40.969487 28866 net.cpp:105] data -> data
I0622 00:04:40.969501 28866 net.cpp:105] data -> label
I0622 00:04:40.969517 28866 data_layer.cpp:148] Opening leveldb cifar10_gcn_padded-leveldb/cifar-train-leveldb
F0622 00:04:40.969605 28866 data_layer.cpp:151] Check failed: status.ok() Failed to open leveldb cifar10_gcn_padded-leveldb/cifar-train-leveldb
IO error: cifar10_gcn_padded-leveldb/cifar-train-leveldb/LOCK: No such file or directory
*** Check failure stack trace: ***
@ 0x7fdde68fdb7d google::LogMessage::Fail()
@ 0x7fdde68ffc7f google::LogMessage::SendToLog()
@ 0x7fdde68fd76c google::LogMessage::Flush()
@ 0x7fdde690051d google::LogMessageFatal::~LogMessageFatal()
@ 0x45a63f caffe::DataLayer<>::SetUp()
@ 0x42b564 caffe::Net<>::Init()
@ 0x42cc5a caffe::Net<>::Net()
@ 0x435d7c caffe::Solver<>::Solver()
@ 0x40e79f main
@ 0x7fdde46457ed (unknown)
@ 0x40fdfd (unknown)
Aborted (core dumped)
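The failure above is a missing-dataset error, not a build problem: the Caffe data layer looks for the LevelDB at a path relative to the directory train_full.sh is launched from, and dies on the missing LOCK file. A minimal Python sketch (my own, not part of the repository) of the sanity check one can run before training; the path is copied from the error message:

```python
import os

def check_leveldb(path):
    """Return True if `path` looks like an existing LevelDB directory.

    Caffe's data layer fails with "IO error: .../LOCK: No such file or
    directory" (as in the stack trace above) when this directory is absent.
    """
    return os.path.isdir(path) and os.path.exists(os.path.join(path, "LOCK"))

# Path taken verbatim from the error log; it is resolved relative to the
# current working directory when the training script runs.
db = "cifar10_gcn_padded-leveldb/cifar-train-leveldb"
if not check_leveldb(db):
    print("LevelDB not found: generate the CIFAR-10 LevelDB first (see the "
          "repo's data-preparation instructions) or fix the `source:` path "
          "in the train prototxt.")
```

If the database exists but lives elsewhere, either launch the script from the directory containing it or edit the `source:` field of the data layer in the prototxt.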
In formulation (3) there is a margin factor \gamma, which is there to prevent the hinge loss from reaching 0. However, I haven't found this parameter in the code.
Reaching zero loss is quite common in deep learning and is usually a sign of overfitting. People typically use dropout to prevent the loss from reaching zero too early.
Hi - I looked at the code and the ip_svm layer is simply an inner product.
Am I missing something, or is the SVM not implemented here - i.e., it is just an inner-product layer with a squared hinge loss? With SGD, do you consider that an approximation of an SVM?
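For reference, the "inner product + squared hinge" reading is easy to check numerically. Below is a minimal numpy sketch (my own illustration, not the repository's code) of a one-vs-all squared hinge loss with a margin `gamma`, applied to the raw outputs of a linear layer; with gamma > 0 the loss stays positive until every class score satisfies the margin:

```python
import numpy as np

def squared_hinge(scores, label, gamma=1.0):
    """One-vs-all squared hinge loss with margin `gamma`.

    scores: (num_classes,) raw outputs of a linear (inner-product) layer.
    label:  ground-truth class index.
    """
    y = -np.ones_like(scores)
    y[label] = 1.0                       # +1 for the true class, -1 otherwise
    margins = np.maximum(0.0, gamma - y * scores)
    return np.sum(margins ** 2)

scores = np.array([2.0, -1.5, 0.3])     # toy inner-product outputs
loss = squared_hinge(scores, label=0)   # only the third class violates the margin
```

Minimizing this with SGD over minibatches is the usual way an "SVM output layer" is trained inside a net; it optimizes the same objective as a linear SVM on the current features, just stochastically.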
In the paper you mention using a Theano implementation of DSN for experiments on MNIST. Is this code available?
Hi, I reproduced the experiment and appreciate that DSN does a great job. I tried an idea to improve the SVM and got a small improvement on CIFAR-10.
I am curious whether it would also work on other datasets under the DSN framework.
Does the author plan to release configurations for the other datasets (MNIST, SVHN, etc.)?
I noticed that you commented out the "relu_cccp6" layer. If I uncomment "relu_cccp6", the loss stays at 10 after every iteration. Why?
Dear author:
Have you since implemented "Deeply-Supervised Nets" in the PyTorch framework? Or could you kindly provide a PyTorch implementation of DSN? Thanks a lot.
Best
It seems you use the same skeleton as the NIN model apart from the supervised layers, and every number (channel counts, filter sizes) is the same. You even introduce more parameters with the supervised layers. However, your paper states that the number of parameters is arranged to stay comparable with the original NIN model.
I noticed that the weights of the companion loss layers are realized in this project by setting the blobs_lr parameter of the inner-product layers. This is equivalent to formulation (3) as far as training the inner-product layer itself goes. However, the gradients backpropagated to the conv layers below are not influenced by this weight (0.001 in the prototxt file).
In other words, this implementation merely learns the inner-product layers of the companion SVMs more slowly, but applies the classifiers' gradients to the net with equal strength; all SVMs effectively carry the same weight this way.
Caffe provides a parameter called "loss_weight", which, as far as I can see, is the correct way to realize the model described in the paper.
This is just my opinion; if I am wrong, please correct me.
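The distinction the comment draws can be made concrete with a toy backward pass. Below is a minimal numpy sketch (my own illustration, not the project's code) of a linear layer with a squared-error loss: scaling the loss (what Caffe's loss_weight does) scales both the weight gradient and the gradient sent downstream, whereas a per-layer learning-rate multiplier like blobs_lr would scale only the weight update:

```python
import numpy as np

# Toy linear classifier y = W x with loss L = loss_weight * ||y - t||^2.
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4))
x = rng.standard_normal(4)
t = rng.standard_normal(3)

def grads(loss_weight):
    y = W @ x
    dL_dy = loss_weight * 2.0 * (y - t)  # gradient of the scaled loss
    dL_dW = np.outer(dL_dy, x)           # update for this layer's weights
    dL_dx = W.T @ dL_dy                  # gradient passed to lower layers
    return dL_dW, dL_dx

dW1, dx1 = grads(1.0)
dW2, dx2 = grads(0.001)

# loss_weight = 0.001 scales BOTH gradients; a blobs_lr multiplier of
# 0.001 would scale only dL_dW, leaving the downstream gradient at dx1.
assert np.allclose(dW2, 0.001 * dW1)
assert np.allclose(dx2, 0.001 * dx1)
```

So if the paper's per-companion weights are meant to down-weight the deep-supervision gradients reaching the conv stack, loss_weight (or an explicit scale on the loss layer) is indeed the mechanism that expresses it.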
Would you please upload the CIFAR-100 net configuration file? Thanks!
Where are the new code files added to Caffe? How can I make this runnable on up-to-date Caffe? Thanks!