junhyukoh / caffe-lstm
LSTM implementation on Caffe
License: Other
My goal is to port the example, lstm_short, to the latest Caffe releases (now that LSTM support is available in mainline Caffe).
I understand the data is some kind of periodic signal that we will reconstruct.
The clip is a way to mark the beginning of the signal.
What is the label in this example?
The example (which works well in this branch) is here.
https://github.com/junhyukoh/caffe-lstm/blob/master/examples/lstm_sequence/lstm_short.prototxt
Hi Junhyuk, your training regime learns fine.
Can you please shed some light on how the loss is calculated?
Is it calculated from a single label value or from all 320 label values?
I ask because the ip1 layer has num_output set to 1 (shown below).
Thank you.
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "lstm1"
  top: "ip1"
  inner_product_param {
    num_output: 1
    weight_filler {
      type: "gaussian"
      std: 0.1
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "ip1"
  bottom: "label"
  top: "loss"
  include: { phase: TRAIN }
}
I am in real need of real examples, especially in the vision area :)
Is it possible to feed such input to your network? The sample is very simple, and I am not sure whether such input is doable.
I am trying to use video frames as input to the network.
For example, if, on average, the video length is 160 frames, does this mean that batch_size should be at least 160, so that the LSTM layer sees all the frames in the video sequence?
I am guessing that if batch_size is less than the average video length, the sequence would be broken and the LSTM layer would be learning from incomplete sequences. Am I correct?
I have a question about caffe-lstm.
There are members N_ and T_ in class LstmLayer (I found the default value of N_ is 1), and their comments say batch size and length of sequence. I am not sure if that means there are N_ independent sequences and each of them has T_ frames. For example, if I have 5 videos and each one has 100 frames, I must set N_ = 5 and T_ = 100. Likewise, if I have only one video with 100 frames, I must set N_ = 1 and T_ = 100.
But if I have 5 videos with different frame counts, say, video1 has 100 frames, video2 has 90 frames, video3 has 96 frames, video4 has 88 frames, and video5 has 99 frames, I feel N_ must be set to 1 and 'clip' used to handle this training. If N_ is not 1, there is no solution.
The above is my understanding of the members of the LstmLayer class. Could you tell me if it is right? Thanks!
Can anyone please tell me how to run caffe-lstm on Android?
Hi, I'd like to train a classification model with LSTM and Softmax layers. The database is very large and partially labeled. Can you tell me how to leverage the large-scale unlabeled data? Thanks!
Hi,
Any specific way to cite your library in a paper?
In caffe-lstm / src / caffe / layers / lstm_layer.cpp, line 166:
caffe_add(4*H_, pre_gate_t, h_to_gate, pre_gate_t) should be caffe_add(4*H_, pre_gate_t, h_to_gate_t, pre_gate_t),
to match line 185, where h_to_gate_t += 4*H_.
Hi,
Sometimes, when I have a network and its data, it works well when I run on CPU, but when I run on GPU it gives NaN in the softmax outputs and bad accuracy. This happens whether the network starts from previously computed weights or from random initialization.
Note that the same network may work normally after a while. When this problem occurs, it occurs in all my networks based on the LSTM layer. At the same time, if I remove the layer, Caffe works fine on GPU.
Is it possible the library has a bug in the GPU part? Any help with that?
Hi,
I ran ./test_lstm_long.sh and other scripts on an Ubuntu box, and the result in the log file is flat:
...
-0.138143 0.087998
-0.118875 0.087998
-0.0895481 0.087998
-0.0522323 0.087998
-0.00979323 0.087998
...
Any suggestions?
Also, by default the mode is CPU, and the scripts crash when the flag in *solver.prototxt is changed to GPU. I use a C2070, which works fine with other Caffe projects.
thanks-
GTB
In your .sh file, is there nothing pointing to a .bin file?
Hello, I installed Caffe without any problem, but when I try to build caffe-lstm I get the following errors:
[ 86%] Linking CXX executable train_net
../lib/libcaffe.so: undefined reference to `H5LTget_dataset_ndims'
../lib/libcaffe.so: undefined reference to `H5LTread_dataset_int'
../lib/libcaffe.so: undefined reference to `H5LTfind_dataset'
../lib/libcaffe.so: undefined reference to `caffe::BlockingQueue<caffe::Batch<double>*>::pop(std::string const&)'
../lib/libcaffe.so: undefined reference to `google::base::CheckOpMessageBuilder::NewString()'
../lib/libcaffe.so: undefined reference to `H5LTmake_dataset_double'
../lib/libcaffe.so: undefined reference to `H5LTmake_dataset_int'
../lib/libcaffe.so: undefined reference to `H5LTread_dataset_float'
../lib/libcaffe.so: undefined reference to `google::protobuf::internal::NameOfEnum(google::protobuf::EnumDescriptor const*, int)'
../lib/libcaffe.so: undefined reference to `H5LTmake_dataset_float'
../lib/libcaffe.so: undefined reference to `H5LTget_dataset_info'
../lib/libcaffe.so: undefined reference to `H5LTread_dataset_string'
../lib/libcaffe.so: undefined reference to `caffe::BlockingQueue<caffe::Batch<float>*>::pop(std::string const&)'
../lib/libcaffe.so: undefined reference to `H5LTread_dataset_double'
../lib/libcaffe.so: undefined reference to `H5LTmake_dataset_string'
collect2: error: ld returned 1 exit status
make[2]: *** [tools/CMakeFiles/test_net.dir/build.make:127: tools/test_net] Error 1
make[1]: *** [CMakeFiles/Makefile2:510: tools/CMakeFiles/test_net.dir/all] Error 2
../lib/libcaffe.so: (the same undefined references as above)
collect2: error: ld returned 1 exit status
make[2]: *** [tools/CMakeFiles/train_net.dir/build.make:127: tools/train_net] Error 1
make[1]: *** [CMakeFiles/Makefile2:548: tools/CMakeFiles/train_net.dir/all] Error 2
(Followed by repeated Boost notes that the deprecated headers ice_or.hpp, ice_and.hpp, ice_not.hpp, and ice_eq.hpp are pulled in via boost/python.hpp from /home/standnail/Git/caffe-lstm/tools/caffe.cpp.)
[ 86%] Linking CXX executable caffe
../lib/libcaffe.so: (the same undefined references as above)
collect2: error: ld returned 1 exit status
make[2]: *** [tools/CMakeFiles/caffe.bin.dir/build.make:127: tools/caffe] Error 1
make[1]: *** [CMakeFiles/Makefile2:472: tools/CMakeFiles/caffe.bin.dir/all] Error 2
make: *** [Makefile:128: all] Error 2
I set BLAS to OpenBLAS via cmake -DBLAS=open ../caffe-lstm/ and set BLAS := open in Makefile.config.
I am doing this on Arch Linux.
Any solution?
Hello there,
I have variable-sized sequences as input, and each token in a sequence is encoded as a D-dimensional vector.
To store the data efficiently, the sequences are zero-padded: each sequence is stored as a DxM matrix, where for the first sequence only the first m1 columns are non-zero, for the second sequence only the first m2 columns are non-zero, and so on.
I was wondering how to feed this kind of input into the network so that the padded part is ignored as it should be, perhaps by setting "clip" somehow?
Many thanks for your help.
CC
I was trying to use your data layer to write an automatic text generator. But then I realized it is impossible, because I have to input data that is divisible by batch_size, yet at test time I can only feed inputs one by one.
At training time, I know the whole text, so I can have a batch size bigger than one. For example, for the word hello, my data are [none h e l l] and my labels are [h e l l o].
But at test time, for example, I give the net 0 (for none) at the beginning; ideally the net predicts h, and then I use h as the next input. Each input is produced by the net itself, so I can't have a batch size bigger than one at test time. And your implementation doesn't allow us to change the batch size.
Am I right?
Hi, I am new to LSTMs. I find that the LSTM layer outputs T * num_output values for a sequence of length T. However, I want to build a many-to-one model; what should I do?
I have tried the caffe command line to train some models with this project. However, I found the results very strange. Does the caffe command-line tool work with LSTM layers?
I am trying to generate a brief description of input images in English. For that purpose, I am using CNN and LSTM. So far, I am done with CNN module, in which I get a 4096-dimensional vector as the output of fc7 of my caffemodel (VGG net of 16 layers). Also, if I add the SoftMax layer on top of CNN, I am able to get class labels as follows:
e.g., an image of a person with a mobile phone in his hand, sitting on a bed.
So far, I get class labels like 'person', 'mobile', 'bed'.
Now, I wish to generate a sentence from these words or by using the 4096-feature vector I get as CNN output.
I want to feed a sequence of 10 vectors into the LSTM and for those 10 inputs I have 1 output label. I don't have (and don't want to have) labels for intermediate results from LSTM (as such, I just want 1 final output from LSTM for 10 input vectors). Illustrated below:
Input vector --> Label
vector_1 --> n/a
vector_2 --> n/a
vector_3 --> n/a
.
.
.
vector_10 --> final_label
Is it possible to do something like this using this implementation of LSTMs? If so, can you please give me a starting point?
Hi, I'd like to know: if we run the LSTM on a batch of N sequences and obtain the diffs from them, how is the weight for that layer updated? Thanks.
Very nice code and examples.
Thanks for sharing.
Do you have any hint/suggestion on how to implement a Clockwork RNN in Caffe with C++?
Thanks
Hey guys, I am doing action recognition with caffe-lstm. I want to input multiple frames and output one prediction. How do I do that, and what should I put in the prototxt? Thank you.
Hello, I am implementing an LSTM over C3D features and getting the following error:
lstm_layer.cu:172] Check failed: error == cudaSuccess (9 vs. 0) invalid configuration argument
Could you please help me resolve this? I am working with CUDA 8, cuDNN 5.0.5, and a Quadro K2200 GPU.
This seems like the right place to get answers to Caffe LSTM questions :-). You can count on an answer.
I'm comparing the implementation of the LSTM layer over here and the (official merged) one in Caffe. They are different.
Are they conceptually the same relative to the clip_marker implementation?
My question is, if the sequence lengths are the same in the input (i.e., they don't vary) and they match the number of time-steps, then do we need to provide the clip_marker input (in the official caffe version)?
Can the network assume it to be [0,1,1, 1...,1]?
My reason for asking is to debug the network: my own markers may be in error and may be confusing the network.
Thank you.
Hi!
I am trying to train a network that generates poems. I cannot find a good, simple example of using an LSTM for text generation in Caffe. My first question is: how should I send data to the input layer? I want to use the HDF5 data format for input. I think every record has some words as the input, an array of clips, and a label that is the next word following the input words; at the start of every poem the clips would be reset to zeros. Am I correct? So my HDF5 file should have 3 arrays? And after this step, how should I write the .prototxt files and define the layers? Can you explain this to me or give me some example code?
Thanks, in advance!
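If I understand Caffe's HDF5Data layer correctly, yes: three datasets in the .h5 file, and the layer's top names must match the dataset names. A sketch of what the data layer might look like (the dataset names data/clip/label and the batch size are my assumptions, not taken from this repo):

```
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"    # input word encodings
  top: "clip"    # 0 at the first word of each poem, 1 elsewhere
  top: "label"   # the next word
  hdf5_data_param {
    source: "train_h5_list.txt"   # text file listing the .h5 files
    batch_size: 32
  }
  include: { phase: TRAIN }
}
```

The "data" and "clip" tops would then feed the LSTM layer, and "label" the loss layer.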
I'm porting your simple LSTM example to the Caffe mainline tree. As expected, some keywords and parameters are different, as the implementations were developed independently.
My question is about the clipping_threshold parameter.
In your lstm implementation, I see (in the backward lstm computation):
// Clip derivatives before nonlinearity
if (clipping_threshold_ > Dtype(0.)) {
  caffe_bound(4*H_, pre_gate_diff_t, -clipping_threshold_,
      clipping_threshold_, pre_gate_diff_t);
}
I don't see this in the Caffe mainline code. There, clip_gradients is converted into a scale factor:
Dtype scale_factor = clip_gradients / l2norm_diff;
Is it the same parameter? Does it have the same effect? Is one the scaled version of the other?
Could you help with your insight?
Thank you
Auro
As you mentioned in the official Caffe repo, your implementation doesn't support mini-batches. What is your plan for extending it? To support N-step truncated BPTT with a minibatch of M sequences, is introducing M sequential data layers good enough?
After reading the examples in http://caffe.berkeleyvision.org/tutorial/layers.html, I know 'lr_mult' is a learning-rate multiplier for the weights or the biases. But there are 3 'lr_mult' entries in deep_lstm_short.prototxt; what does the third lr_mult mean?
I am still a novice in Caffe; sorry to disturb you.
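My reading of the code (worth double-checking against lstm_layer.cpp) is that this LSTM layer has three learnable blobs, so the three param blocks map to them in order: input-to-hidden weights, hidden-to-hidden (recurrent) weights, and bias. Something like (values illustrative, not copied from the repo):

```
layer {
  name: "lstm1"
  type: "Lstm"
  bottom: "data"
  bottom: "clip"
  top: "lstm1"
  param { lr_mult: 1 }  # blob 0: input-to-hidden weights
  param { lr_mult: 1 }  # blob 1: hidden-to-hidden weights
  param { lr_mult: 2 }  # blob 2: bias
}
```

So the third lr_mult is the multiplier for the bias blob.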