junhyukoh / caffe-lstm
LSTM implementation on Caffe
License: Other
My goal is to port the example, lstm_short, to the latest Caffe releases (now that LSTM support is available in mainline Caffe).
I understand the data is some kind of periodic signal that we will reconstruct.
The clip is a way to mark the beginning of the signal.
What is the label in this example?
The example (which works well in this branch) is here.
https://github.com/junhyukoh/caffe-lstm/blob/master/examples/lstm_sequence/lstm_short.prototxt
Hi Junhyuk, your training regime learns fine.
Can you please shed some light on how the loss is calculated?
Is it calculated from a single label value or from all 320 label values?
I ask because the ip1 layer has num_output set to 1 (shown below).
Thank you.
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "lstm1"
  top: "ip1"
  inner_product_param {
    num_output: 1
    weight_filler {
      type: "gaussian"
      std: 0.1
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "ip1"
  bottom: "label"
  top: "loss"
  include: { phase: TRAIN }
}
I am in real need of real examples, especially in the vision area :)
Is it possible to feed such input to your network? The sample is very simple, and I am not sure whether such input is doable.
I am trying to use video frames as input to the network.
For example, if, on average, the video length is 160 frames, does this mean that batch_size should be at least 160, so that the LSTM layer sees all the frames in the video sequence?
I am guessing that if batch_size is less than the average video length, the sequence would be broken and the LSTM layer would be learning from incomplete sequences. Am I correct?
I have a question about caffe-lstm.
There are members N_ and T_ in class LstmLayer (I found the default value of N_ is 1), and their comments say batch size and length of sequence. I am not sure if that means there are N_ independent sequences and each of them has T_ frames. For example, if I have 5 videos and each one has 100 frames, I must set N_ = 5 and T_ = 100. Likewise, if I have only one video with 100 frames, I must set N_ = 1 and T_ = 100.
But if I have 5 videos with different frame counts, say, video1 has 100 frames, video2 has 90 frames, video3 has 96 frames, video4 has 88 frames, and video5 has 99 frames, I feel N_ must be set to 1 and 'clip' used to handle this training. If N_ is not 1, there is no solution.
The above is my understanding of the members of the LstmLayer class. Could you tell me if it is right? Thanks!
Can anyone please tell me how to run caffe-lstm on Android?
Hi, I'd like to train a classification model with LSTM and Softmax layers. The database is very large and partially labeled. Can you tell me how to leverage the large-scale unlabeled data? Thanks!
Hi,
Any specific way to cite your library in a paper?
In caffe-lstm / src / caffe / layers / lstm_layer.cpp, line 166:
caffe_add(4*H_, pre_gate_t, h_to_gate, pre_gate_t) should be caffe_add(4*H_, pre_gate_t, h_to_gate_t, pre_gate_t),
to match line 185, where h_to_gate_t += 4*H_.
Hi,
Sometimes, when I have a network and its data, it works well when I run on CPU, but when I run on GPU it gives NaN in the softmax outputs and bad accuracy. This happens whether the network starts from previously computed weights or from random initialization.
Note that the same network may work normally after a while. When this problem occurs, it occurs in all my networks based on the LSTM layer. At the same time, if I remove the layer, Caffe works fine on GPU.
Is it possible the library has a bug in the GPU part? Any help with that?
Hi,
I ran ./test_lstm_long.sh and other scripts on an Ubuntu box, and the result in the log file is flat:
...
-0.138143 0.087998
-0.118875 0.087998
-0.0895481 0.087998
-0.0522323 0.087998
-0.00979323 0.087998
...
Any suggestions?
Also, by default the mode is CPU, and the scripts crash when the flag in *solver.prototxt is changed to GPU. I use a C2070, which works fine with other Caffe projects.
thanks-
GTB
In your .sh file, is there nothing pointing to a .bin file?
Hello, I installed Caffe without any problem, but when I try to build caffe-lstm I get the following errors:
[ 86%] Linking CXX executable train_net
../lib/libcaffe.so: undefined reference to `H5LTget_dataset_ndims'
../lib/libcaffe.so: undefined reference to `H5LTread_dataset_int'
../lib/libcaffe.so: undefined reference to `H5LTfind_dataset'
../lib/libcaffe.so: undefined reference to `caffe::BlockingQueue<caffe::Batch<double>*>::pop(std::string const&)'
../lib/libcaffe.so: undefined reference to `google::base::CheckOpMessageBuilder::NewString()'
../lib/libcaffe.so: undefined reference to `H5LTmake_dataset_double'
../lib/libcaffe.so: undefined reference to `H5LTmake_dataset_int'
../lib/libcaffe.so: undefined reference to `H5LTread_dataset_float'
../lib/libcaffe.so: undefined reference to `google::protobuf::internal::NameOfEnum(google::protobuf::EnumDescriptor const*, int)'
../lib/libcaffe.so: undefined reference to `H5LTmake_dataset_float'
../lib/libcaffe.so: undefined reference to `H5LTget_dataset_info'
../lib/libcaffe.so: undefined reference to `H5LTread_dataset_string'
../lib/libcaffe.so: undefined reference to `caffe::BlockingQueue<caffe::Batch<float>*>::pop(std::string const&)'
../lib/libcaffe.so: undefined reference to `H5LTread_dataset_double'
../lib/libcaffe.so: undefined reference to `H5LTmake_dataset_string'
collect2: error: ld returned 1 exit status
make[2]: *** [tools/CMakeFiles/test_net.dir/build.make:127: tools/test_net] Error 1
make[1]: *** [CMakeFiles/Makefile2:510: tools/CMakeFiles/test_net.dir/all] Error 2
../lib/libcaffe.so: (the same undefined references as above)
collect2: error: ld returned 1 exit status
make[2]: *** [tools/CMakeFiles/train_net.dir/build.make:127: tools/train_net] Error 1
make[1]: *** [CMakeFiles/Makefile2:548: tools/CMakeFiles/train_net.dir/all] Error 2
(Followed by repeated Boost notes that the deprecated headers ice_or.hpp, ice_and.hpp, ice_not.hpp, and ice_eq.hpp are pulled in via boost/python.hpp from /home/standnail/Git/caffe-lstm/tools/caffe.cpp.)
[ 86%] Linking CXX executable caffe
../lib/libcaffe.so: (the same undefined references as above)
collect2: error: ld returned 1 exit status
make[2]: *** [tools/CMakeFiles/caffe.bin.dir/build.make:127: tools/caffe] Error 1
make[1]: *** [CMakeFiles/Makefile2:472: tools/CMakeFiles/caffe.bin.dir/all] Error 2
make: *** [Makefile:128: all] Error 2
I set BLAS to OpenBLAS via cmake -DBLAS=open ../caffe-lstm/ and set BLAS := open in Makefile.config.
I am doing this on Arch Linux.
Any solution?
Hello there,
I have variable-sized sequences as input, and each token in a sequence is encoded as a D-dimensional vector.
To store the data efficiently, the sequences are zero-padded: each sequence is stored as a DxM matrix, where for the first sequence only the first m1 columns are non-zero, for the second sequence only the first m2 columns are non-zero, and so on.
I was wondering how to feed this kind of input into the network so that the padded part is ignored as it should be, perhaps by setting "clip" somehow?
Many thanks for your help.
CC
I was trying to use your data layer to write an automatic text generator. But then I realized it is impossible, because I have to input data that is divisible by batch_size, yet at test time I can only feed inputs one by one.
At training time, I know the whole text, so I can have a batch size bigger than one. For example, for the word hello, my data are [none h e l l] and my labels are [h e l l o].
But at test time, for example, I give the net 0 (for none) at the beginning; ideally the net predicts h, and then I use h as the next input. Each input is produced by the net itself, so I can't have a batch size bigger than one at test time. And your implementation doesn't allow us to change the batch size.
Am I right?
Hi, I am new to LSTMs. I find that the LSTM layer outputs T * num_output values for a sequence of length T. However, I want to build a many-to-one model; what should I do?
I have tried the caffe command line to train some models with this project. However, I found the results very strange. Does the caffe command-line tool work with LSTM layers?
I am trying to generate a brief description of input images in English. For that purpose, I am using CNN and LSTM. So far, I am done with CNN module, in which I get a 4096-dimensional vector as the output of fc7 of my caffemodel (VGG net of 16 layers). Also, if I add the SoftMax layer on top of CNN, I am able to get class labels as follows:
e.g., an image of a person with a mobile phone in his hand, sitting on a bed.
So far, I get class labels like 'person', 'mobile', 'bed'.
Now, I wish to generate a sentence from these words or by using the 4096-feature vector I get as CNN output.
I want to feed a sequence of 10 vectors into the LSTM and for those 10 inputs I have 1 output label. I don't have (and don't want to have) labels for intermediate results from LSTM (as such, I just want 1 final output from LSTM for 10 input vectors). Illustrated below:
Input vector --> Label
vector_1 --> n/a
vector_2 --> n/a
vector_3 --> n/a
.
.
.
vector_10 --> final_label
Is it possible to do something like this using this implementation of LSTMs? If so, can you please give me a starting point?
Hi, I'd like to know: if we run the LSTM on a batch of N sequences and obtain the diffs from them, how is the weight for that layer updated? Thanks.
Very nice code and examples.
Thanks for sharing.
Do you have any hint/suggestion on how to implement a Clockwork RNN in Caffe with C++?
Thanks
Hey guys, I am doing action recognition with caffe-lstm. I want to input multiple frames and output one prediction. How do I do that, and what should I put in the prototxt? Thank you.
Hello, I am implementing an LSTM over C3D features and getting the following error:
lstm_layer.cu:172] Check failed: error == cudaSuccess (9 vs. 0) invalid configuration argument
Could you please help me resolve this? I am working with CUDA 8, cuDNN 5.0.5, and a Quadro K2200 GPU.
This seems like the right place to get answers to Caffe LSTM questions :-). You can count on an answer.
I'm comparing the implementation of the LSTM layer over here and the (official merged) one in Caffe. They are different.
Are they conceptually the same relative to the clip_marker implementation?
My question is, if the sequence lengths are the same in the input (i.e., they don't vary) and they match the number of time-steps, then do we need to provide the clip_marker input (in the official caffe version)?
Can the network assume it to be [0,1,1, 1...,1]?
My reason for asking is to debug the network: my own markers may be in error and may be confusing the network.
Thank you.
Hi!
I am trying to train a network that generates poems. I cannot find a good, simple example of using an LSTM for text generation in Caffe. My first question is: how should I send data to the input layer? I want to use the HDF5 data format for input. I think every record has some words as the input, an array of clips, and a label that is the next word following the input words; at the start of every poem the clips would be reset to zeros. Am I correct? So my HDF5 file should have 3 arrays? And after this step, how should I write the .prototxt files and define the layers? Can you explain this to me or give me some example code?
Thanks, in advance!
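If I understand Caffe's HDF5Data layer correctly, yes: three datasets in the .h5 file, and the layer's top names must match the dataset names. A sketch of what the data layer might look like (the dataset names data/clip/label and the batch size are my assumptions, not taken from this repo):

```
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"    # input word encodings
  top: "clip"    # 0 at the first word of each poem, 1 elsewhere
  top: "label"   # the next word
  hdf5_data_param {
    source: "train_h5_list.txt"   # text file listing the .h5 files
    batch_size: 32
  }
  include: { phase: TRAIN }
}
```

The "data" and "clip" tops would then feed the LSTM layer, and "label" the loss layer.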
I'm porting your simple LSTM example to the Caffe mainline tree. As expected, some keywords and parameters are different, as the implementations were developed independently.
My question is about the clipping_threshold parameter.
In your lstm implementation, I see (in the backward lstm computation):
// Clip derivatives before nonlinearity
if (clipping_threshold_ > Dtype(0.)) {
  caffe_bound(4*H_, pre_gate_diff_t, -clipping_threshold_,
      clipping_threshold_, pre_gate_diff_t);
}
I don't see this in the Caffe mainline code. There, clip_gradients is converted into a scale factor:
Dtype scale_factor = clip_gradients / l2norm_diff;
Is it the same parameter? Does it have the same effect? Is one the scaled version of the other?
Could you help with your insight?
Thank you
Auro
As you mentioned in the official Caffe repo, your implementation doesn't support mini-batches. What is your plan for extending it? To support N-step truncated BPTT with a minibatch of M sequences, is introducing M sequential data layers good enough?
After reading the examples in http://caffe.berkeleyvision.org/tutorial/layers.html, I know 'lr_mult' is a learning-rate multiplier for the weights or the biases. But there are 3 'lr_mult' entries in deep_lstm_short.prototxt; what does the third lr_mult mean?
I am still a novice in Caffe; sorry to disturb you.
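My reading of the code (worth double-checking against lstm_layer.cpp) is that this LSTM layer has three learnable blobs, so the three param blocks map to them in order: input-to-hidden weights, hidden-to-hidden (recurrent) weights, and bias. Something like (values illustrative, not copied from the repo):

```
layer {
  name: "lstm1"
  type: "Lstm"
  bottom: "data"
  bottom: "clip"
  top: "lstm1"
  param { lr_mult: 1 }  # blob 0: input-to-hidden weights
  param { lr_mult: 1 }  # blob 1: hidden-to-hidden weights
  param { lr_mult: 2 }  # blob 2: bias
}
```

So the third lr_mult is the multiplier for the bias blob.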