hycis / bidirectional_rnn
bidirectional lstm
License: MIT License
Hi, I'm writing a collection of models for Keras and I would like to use your birnn code as a starting point. Could you please add a license to your repo before I use it? Thanks!
As mentioned, it can be used to develop a model for Deep Speech.
How should the training data be prepared as input?
I get the following error when I run the IMDB example:
Traceback (most recent call last):
File "imdb_birnn.py", line 77, in
model.add(BatchNormalization((24 * maxseqlen,)))
File "/home/dxquang/anaconda/lib/python2.7/site-packages/keras/layers/containers.py", line 40, in add
layer.init_updates()
File "/home/dxquang/anaconda/lib/python2.7/site-packages/keras/layers/normalization.py", line 38, in init_updates
X = self.get_input(train=True)
File "/home/dxquang/anaconda/lib/python2.7/site-packages/keras/layers/core.py", line 43, in get_input
return self.previous.get_output(train=train)
File "/home/dxquang/anaconda/lib/python2.7/site-packages/keras/layers/core.py", line 296, in get_output
X = self.get_input(train)
File "/home/dxquang/anaconda/lib/python2.7/site-packages/keras/layers/core.py", line 43, in get_input
return self.previous.get_output(train=train)
File "/home/dxquang/bidirectional_RNN/birnn.py", line 187, in get_output
forward = self.get_forward_output(train)
File "/home/dxquang/bidirectional_RNN/birnn.py", line 143, in get_forward_output
X = X.dimshuffle((1,0,2))
File "/home/dxquang/anaconda/lib/python2.7/site-packages/theano/tensor/var.py", line 341, in dimshuffle
pattern)
File "/home/dxquang/anaconda/lib/python2.7/site-packages/theano/tensor/elemwise.py", line 141, in __init__
(i, j, len(input_broadcastable)))
ValueError: new_order[2] is 2, but the input only has 2 axes.
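For context, this ValueError means the tensor reaching birnn's dimshuffle((1, 0, 2)) is 2-D, while the layer needs a 3-D (batch, time, features) input; the BatchNormalization((24 * maxseqlen,)) line suggests the activations were flattened upstream. A NumPy sketch of the shape repair (maxseqlen and the feature size are assumptions, not values taken from the traceback):

```python
import numpy as np

# Hypothetical shapes from the IMDB script (assumed values):
batch, maxseqlen, word_vec_len = 4, 100, 256

# After a Dense/BatchNormalization layer the activations are 2-D:
flat = np.zeros((batch, maxseqlen * word_vec_len), dtype="float32")

# birnn's dimshuffle((1, 0, 2)) needs a 3-D (batch, time, features) tensor,
# which is why a 2-D input raises "new_order[2] is 2, but the input only
# has 2 axes". Restoring the time axis fixes the shape:
seq = flat.reshape(batch, maxseqlen, word_vec_len)
assert seq.shape == (batch, maxseqlen, word_vec_len)
```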
It looks like there may be a problem with the load_weights method when trying to re-load the bi-directional model after a previous training session.
File "D:/Final project/TimeNLPTest/KerasVec2Vec.py", line 130, in buildBiLSTM
model.load_weights(dir + "StackedLSTM E-D 13 epochs 50 512 20 relu mse adam 2015-08-23 10.55.41.621000 .wt")
File "C:\Anaconda\lib\site-packages\keras\models.py", line 492, in load_weights
for k in range(f.attrs['nb_layers']):
File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (D:\Build\h5py\h5py-2.5.x\h5py\_objects.c:2475)
File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (D:\Build\h5py\h5py-2.5.x\h5py\_objects.c:2432)
File "C:\Anaconda\lib\site-packages\h5py\_hl\attrs.py", line 52, in __getitem__
attr = h5a.open(self._id, self._e(name))
File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (D:\Build\h5py\h5py-2.5.x\h5py\_objects.c:2475)
File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (D:\Build\h5py\h5py-2.5.x\h5py\_objects.c:2432)
File "h5py\h5a.pyx", line 77, in h5py.h5a.open (D:\Build\h5py\h5py-2.5.x\h5py\h5a.c:2079)
KeyError: "Can't open attribute (Can't locate attribute: 'nb_layers')"
I can try to work with you to sort this out; what do you think the problem could be?
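In case it helps to debug: the KeyError means the HDF5 file has no top-level 'nb_layers' attribute, which load_weights expects. A small sketch for inspecting a weights file before loading (the file name and attribute layout here are assumptions based on how old Keras save_weights appears to write files):

```python
import h5py

# Write a toy file the way old Keras save_weights appears to (assumption):
# a top-level 'nb_layers' attribute plus one group per layer.
with h5py.File("demo_weights.h5", "w") as f:
    f.attrs["nb_layers"] = 2
    for k in range(2):
        f.create_group("layer_%d" % k)

# Inspect any weights file before load_weights: if 'nb_layers' is missing,
# the file was written by a different Keras version or an interrupted save.
with h5py.File("demo_weights.h5", "r") as f:
    print(dict(f.attrs))   # e.g. {'nb_layers': 2}
    print(list(f.keys()))  # ['layer_0', 'layer_1']
```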
I am trying to implement a language-model-style task using a bidirectional LSTM.
For a unidirectional LSTM, if the sentence is:
"A recurrent neural network is a class of artificial neural network"
then we prepare the training data as follows:
x1 = A(x11) recurrent(x12)
y1 = neural
x2 = recurrent(x21) neural(x22)
y2 = network
and so on....
But for a bidirectional LSTM, we must have different inputs for the forward and backward passes for the same output, like:
xf1 = A(xf11) recurrent(xf12)
xb1 = network(xb11) is(xb12)
y1 = neural
How do I specify this kind of input for the bidirectional LSTM here? Please help.
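The pattern described above can be sketched as a function that emits (forward-context, backward-context, target) triples; the window size, function name, and the convention of listing the following words in sentence order are my assumptions, not anything from this repo:

```python
def make_bidir_samples(tokens, context=2):
    """Build (xf, xb, y) triples: words before the target, words after
    the target, and the target word itself (toy sketch, window = context)."""
    samples = []
    for i in range(context, len(tokens) - context):
        xf = tokens[i - context:i]          # forward context: words before y
        xb = tokens[i + 1:i + 1 + context]  # backward context: words after y
        y = tokens[i]
        samples.append((xf, xb, y))
    return samples

sent = "A recurrent neural network is a class of artificial neural network".split()
samples = make_bidir_samples(sent)
# samples[0] == (['A', 'recurrent'], ['network', 'is'], 'neural'),
# matching xf1 / xb1 / y1 above.
```

Whether the backward RNN then consumes xb as-is or reversed is a design choice of the particular implementation.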
Nice work, this looks very promising.
I was thinking of modifying it for something I'm doing, since it doesn't seem to implement return_sequences, so you can't return a whole sequence of outputs. Or am I missing something?
If you want to give me a head start, how would you imagine modifying this to return a sequence of outputs? :)
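Conceptually, returning a sequence means keeping every timestep's hidden state from both directions, realigning the backward outputs in time, and merging. A NumPy sketch of that idea (this is not the repo's API; the recurrence is a toy stand-in for an LSTM cell):

```python
import numpy as np

def run_direction(x, reverse=False):
    """Run a toy recurrence over x (time, features), collecting every
    timestep's state; backward outputs are re-reversed so that index t
    in both output sequences refers to the same input timestep."""
    steps = x[::-1] if reverse else x
    h = np.zeros(x.shape[1], dtype=x.dtype)
    outs = []
    for t in range(x.shape[0]):
        h = np.tanh(steps[t] + h)           # toy recurrence, not an LSTM
        outs.append(h)
    outs = np.stack(outs)
    return outs[::-1] if reverse else outs  # realign backward pass in time

x = np.random.randn(5, 8).astype("float32")
fwd = run_direction(x)
bwd = run_direction(x, reverse=True)
seq_out = np.concatenate([fwd, bwd], axis=-1)  # 'concat'-style merge
assert seq_out.shape == (5, 16)
```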
Thanks for your Keras BLSTM layer implementation; it has been very helpful. I am having issues with sizing my inputs for training the BLSTM. I create my stack of layers with:
NUM_FEATURES = 64
MAX_LEN = 16
NUM_CLASSES = 2
model = Sequential()
# model.add(Transform((NUM_FEATURES,)))
model.add(BiDirectionLSTM(NUM_FEATURES, 128, output_mode='sum'))
model.add(BiDirectionLSTM(128,128, output_mode='sum'))
model.add(Dense(128,1,activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', class_mode="binary")
model.fit(X_train, y_train, batch_size=20, nb_epoch=3, validation_data=(X_valid, y_valid))
And while the model is able to compile, it throws an error when attempting to fit:
TypeError: ('Bad input argument to theano function with name "build/bdist.macosx-10.5-x86_64/egg/keras/models.py:127" at index 1(0-based)',
'Wrong number of dimensions: expected 3, got 2 with shape (20, 1).')
The sizes of my training and validation data are:
5766 train sequences
3072 test sequences
X_train shape: (5766, 16, 64)
X_valid shape: (1441, 16, 64)
I know you added the Transform method to help size inputs appropriately for the BLSTM. However, since my input data is already 3D and I have no MLP before the BLSTMs, I figured there was no need to transform the input?
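For what it's worth, the error points at argument index 1, i.e. y_train rather than X_train: the compiled function wants a 3-D target, which suggests the stacked BLSTM is emitting a per-timestep output. If one label per sequence is what's wanted, the final layer should produce a single output per sequence; otherwise the (20, 1) labels could be tiled across time to match the 3-D output. This tiling fix is a guess at the intended behavior, not the repo's documented usage:

```python
import numpy as np

MAX_LEN = 16                                  # timesteps, as in the model above
y_batch = np.zeros((20, 1), dtype="float32")  # stand-in binary labels, one per sequence

# Repeat each sequence label at every timestep -> shape (20, 16, 1),
# matching a per-timestep (3-D) model output:
y_seq = np.repeat(y_batch[:, None, :], MAX_LEN, axis=1)
assert y_seq.shape == (20, MAX_LEN, 1)
```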
When I run imdb_birnn.py for the first time, I get the error below.
Isn't Transform() working?
$ python imdb_birnn.py
Using gpu device 0: GeForce GTX 770
Loading data...
20000 train sequences
5000 test sequences
Pad sequences (samples x time)
train_X shape: (20000, 100)
test_X shape: (5000, 100)
Build model...
Traceback (most recent call last):
File "imdb_birnn.py", line 64, in <module>
model.add(Dense(word_vec_len, 100, activation='relu'))
File "/usr/local/lib/python2.7/dist-packages/keras/layers/containers.py", line 37, in add
self.layers[-1].set_previous(self.layers[-2])
File "/usr/local/lib/python2.7/dist-packages/keras/layers/core.py", line 33, in set_previous
str(self.input_ndim) + " but previous layer has output_shape " + str(layer.output_shape)
AssertionError: Incompatible shapes: layer expected input with ndim=2 but previous layer has output_shape (None, None, 256)
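The mismatch is exactly what the assertion says: Dense wants a 2-D (batch, features) input, while the previous layer emits a 3-D (batch, time, 256) sequence. A NumPy sketch of the bridge a flattening layer provides (that this is what the repo's Transform layer does is an assumption):

```python
import numpy as np

batch, time, feats = 32, 100, 256
seq = np.zeros((batch, time, feats), dtype="float32")  # 3-D recurrent output

# Collapsing time and features into one axis yields a valid Dense input:
flat = seq.reshape(batch, time * feats)
assert flat.shape == (32, 25600)
```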
Hello. My sequences are of varying length, so many of them are 0-padded. From what I understand, the correct way to treat 0-padded sequences is with mask_zero=True on an Embedding layer.
text_model = Sequential()
text_model.add(Embedding(max_features, 64, mask_zero=True))
text_model.add(BiDirectionLSTM(64, 64, return_sequences=True))
The above code raises the following exception:
"Exception: Cannot connect non-masking layer to layer with masked output"
Do you have any plans to support masking? It seems like an important feature for dealing with sequences.
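Masking does matter for padded batches. The core idea, independent of this repo, is that zero-padded timesteps must not contribute to the recurrence or the loss; a minimal sketch of deriving a mask from padded token ids:

```python
import numpy as np

x = np.array([[3, 7, 0, 0],
              [5, 0, 0, 0]])   # 0 is the padding token id
mask = (x != 0)                # True wherever there is a real timestep
lengths = mask.sum(axis=1)     # effective length of each sequence
assert lengths.tolist() == [2, 1]
```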
Hi,
The latest version of Keras no longer ships theano_utils; is there any alternative for this?
Traceback (most recent call last):
File "cnn_blstm.py", line 28, in
from birnn import BiDirectionLSTM, Transform
File "/home/khanma0b/keras-master/deepfunc-master/birnn.py", line 5, in
from keras.utils.theano_utils import shared_zeros, alloc_zeros_matrix
ImportError: No module named theano_utils