hycis / bidirectional_rnn
bidirectional lstm
License: MIT License
Hi, I'm writing a collection of models for Keras and I would like to use your birnn code as a starting point. Could you please add a license to your repo before I use it? Thanks!
As mentioned, it can be used to develop a model for Deep Speech.
How should the training data be prepared as input?
I get the following error when I run the IMDB example:
Traceback (most recent call last):
File "imdb_birnn.py", line 77, in
model.add(BatchNormalization((24 * maxseqlen,)))
File "/home/dxquang/anaconda/lib/python2.7/site-packages/keras/layers/containers.py", line 40, in add
layer.init_updates()
File "/home/dxquang/anaconda/lib/python2.7/site-packages/keras/layers/normalization.py", line 38, in init_updates
X = self.get_input(train=True)
File "/home/dxquang/anaconda/lib/python2.7/site-packages/keras/layers/core.py", line 43, in get_input
return self.previous.get_output(train=train)
File "/home/dxquang/anaconda/lib/python2.7/site-packages/keras/layers/core.py", line 296, in get_output
X = self.get_input(train)
File "/home/dxquang/anaconda/lib/python2.7/site-packages/keras/layers/core.py", line 43, in get_input
return self.previous.get_output(train=train)
File "/home/dxquang/bidirectional_RNN/birnn.py", line 187, in get_output
forward = self.get_forward_output(train)
File "/home/dxquang/bidirectional_RNN/birnn.py", line 143, in get_forward_output
X = X.dimshuffle((1,0,2))
File "/home/dxquang/anaconda/lib/python2.7/site-packages/theano/tensor/var.py", line 341, in dimshuffle
pattern)
File "/home/dxquang/anaconda/lib/python2.7/site-packages/theano/tensor/elemwise.py", line 141, in __init__
(i, j, len(input_broadcastable)))
ValueError: new_order[2] is 2, but the input only has 2 axes.
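For context, this ValueError means the tensor reaching birnn's dimshuffle((1, 0, 2)) is 2-D, while the layer needs a 3-D (batch, time, features) input; the BatchNormalization((24 * maxseqlen,)) line suggests the activations were flattened upstream. A NumPy sketch of the shape repair (maxseqlen and the feature size are assumptions, not values taken from the traceback):

```python
import numpy as np

# Hypothetical shapes from the IMDB script (assumed values):
batch, maxseqlen, word_vec_len = 4, 100, 256

# After a Dense/BatchNormalization layer the activations are 2-D:
flat = np.zeros((batch, maxseqlen * word_vec_len), dtype="float32")

# birnn's dimshuffle((1, 0, 2)) needs a 3-D (batch, time, features) tensor,
# which is why a 2-D input raises "new_order[2] is 2, but the input only
# has 2 axes". Restoring the time axis fixes the shape:
seq = flat.reshape(batch, maxseqlen, word_vec_len)
assert seq.shape == (batch, maxseqlen, word_vec_len)
```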
It looks like there may be a problem with the load_weights method when trying to re-load the bi-directional model after a previous training session.
File "D:/Final project/TimeNLPTest/KerasVec2Vec.py", line 130, in buildBiLSTM
model.load_weights(dir + "StackedLSTM E-D 13 epochs 50 512 20 relu mse adam 2015-08-23 10.55.41.621000 .wt")
File "C:\Anaconda\lib\site-packages\keras\models.py", line 492, in load_weights
for k in range(f.attrs['nb_layers']):
File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (D:\Build\h5py\h5py-2.5.x\h5py\_objects.c:2475)
File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (D:\Build\h5py\h5py-2.5.x\h5py\_objects.c:2432)
File "C:\Anaconda\lib\site-packages\h5py\_hl\attrs.py", line 52, in __getitem__
attr = h5a.open(self._id, self._e(name))
File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (D:\Build\h5py\h5py-2.5.x\h5py\_objects.c:2475)
File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (D:\Build\h5py\h5py-2.5.x\h5py\_objects.c:2432)
File "h5py\h5a.pyx", line 77, in h5py.h5a.open (D:\Build\h5py\h5py-2.5.x\h5py\h5a.c:2079)
KeyError: "Can't open attribute (Can't locate attribute: 'nb_layers')"
I can try to work with you to sort this out; what do you think the problem could be?
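In case it helps to debug: the KeyError means the HDF5 file has no top-level 'nb_layers' attribute, which load_weights expects. A small sketch for inspecting a weights file before loading (the file name and attribute layout here are assumptions based on how old Keras save_weights appears to write files):

```python
import h5py

# Write a toy file the way old Keras save_weights appears to (assumption):
# a top-level 'nb_layers' attribute plus one group per layer.
with h5py.File("demo_weights.h5", "w") as f:
    f.attrs["nb_layers"] = 2
    for k in range(2):
        f.create_group("layer_%d" % k)

# Inspect any weights file before load_weights: if 'nb_layers' is missing,
# the file was written by a different Keras version or an interrupted save.
with h5py.File("demo_weights.h5", "r") as f:
    print(dict(f.attrs))   # e.g. {'nb_layers': 2}
    print(list(f.keys()))  # ['layer_0', 'layer_1']
```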
I am trying to implement a language-model-style task using a bidirectional LSTM.
For a unidirectional LSTM, if the sentence is:
"A recurrent neural network is a class of artificial neural network"
then we prepare the training data as follows:
x1 = A(x11) recurrent(x12)
y1 = neural
x2 = recurrent(x21) neural(x22)
y2 = network
and so on....
But for a bidirectional LSTM, we must have different inputs for the forward and backward passes for the same output, like:
xf1 = A(xf11) recurrent(xf12)
xb1 = network(xb11) is(xb12)
y1 = neural
How do I specify this kind of input for the bidirectional LSTM here? Please help.
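The pattern described above can be sketched as a function that emits (forward-context, backward-context, target) triples; the window size, function name, and the convention of listing the following words in sentence order are my assumptions, not anything from this repo:

```python
def make_bidir_samples(tokens, context=2):
    """Build (xf, xb, y) triples: words before the target, words after
    the target, and the target word itself (toy sketch, window = context)."""
    samples = []
    for i in range(context, len(tokens) - context):
        xf = tokens[i - context:i]          # forward context: words before y
        xb = tokens[i + 1:i + 1 + context]  # backward context: words after y
        y = tokens[i]
        samples.append((xf, xb, y))
    return samples

sent = "A recurrent neural network is a class of artificial neural network".split()
samples = make_bidir_samples(sent)
# samples[0] == (['A', 'recurrent'], ['network', 'is'], 'neural'),
# matching xf1 / xb1 / y1 above.
```

Whether the backward RNN then consumes xb as-is or reversed is a design choice of the particular implementation.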
Nice work, this looks very promising.
I was thinking of modifying it for something I'm doing, since it doesn't seem to implement return_sequences, so you can't return a whole sequence of outputs. Or am I missing something?
If you want to give me a head start, how would you imagine modifying this to return a sequence of outputs? :)
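Conceptually, returning a sequence means keeping every timestep's hidden state from both directions, realigning the backward outputs in time, and merging. A NumPy sketch of that idea (this is not the repo's API; the recurrence is a toy stand-in for an LSTM cell):

```python
import numpy as np

def run_direction(x, reverse=False):
    """Run a toy recurrence over x (time, features), collecting every
    timestep's state; backward outputs are re-reversed so that index t
    in both output sequences refers to the same input timestep."""
    steps = x[::-1] if reverse else x
    h = np.zeros(x.shape[1], dtype=x.dtype)
    outs = []
    for t in range(x.shape[0]):
        h = np.tanh(steps[t] + h)           # toy recurrence, not an LSTM
        outs.append(h)
    outs = np.stack(outs)
    return outs[::-1] if reverse else outs  # realign backward pass in time

x = np.random.randn(5, 8).astype("float32")
fwd = run_direction(x)
bwd = run_direction(x, reverse=True)
seq_out = np.concatenate([fwd, bwd], axis=-1)  # 'concat'-style merge
assert seq_out.shape == (5, 16)
```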
Thanks for your Keras BLSTM layer implementation; it has been very helpful. I am having issues with sizing my inputs for training the BLSTM. I create my stack of layers with:
NUM_FEATURES = 64
MAX_LEN = 16
NUM_CLASSES = 2
model = Sequential()
# model.add(Transform((NUM_FEATURES,)))
model.add(BiDirectionLSTM(NUM_FEATURES, 128, output_mode='sum'))
model.add(BiDirectionLSTM(128,128, output_mode='sum'))
model.add(Dense(128,1,activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', class_mode="binary")
model.fit(X_train, y_train, batch_size=20, nb_epoch=3, validation_data=(X_valid, y_valid))
And while the model is able to compile, it throws an error when attempting to fit:
TypeError: ('Bad input argument to theano function with name "build/bdist.macosx-10.5-x86_64/egg/keras/models.py:127" at index 1(0-based)',
'Wrong number of dimensions: expected 3, got 2 with shape (20, 1).')
The sizes of my training and validation data are:
5766 train sequences
3072 test sequences
X_train shape: (5766, 16, 64)
X_valid shape: (1441, 16, 64)
I know you added the Transform method to help size inputs appropriately for the BLSTM. However, since my input data is already 3D and I have no MLP before the BLSTMs, I figured there was no need to transform the input?
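For what it's worth, the error points at argument index 1, i.e. y_train rather than X_train: the compiled function wants a 3-D target, which suggests the stacked BLSTM is emitting a per-timestep output. If one label per sequence is what's wanted, the final layer should produce a single output per sequence; otherwise the (20, 1) labels could be tiled across time to match the 3-D output. This tiling fix is a guess at the intended behavior, not the repo's documented usage:

```python
import numpy as np

MAX_LEN = 16                                  # timesteps, as in the model above
y_batch = np.zeros((20, 1), dtype="float32")  # stand-in binary labels, one per sequence

# Repeat each sequence label at every timestep -> shape (20, 16, 1),
# matching a per-timestep (3-D) model output:
y_seq = np.repeat(y_batch[:, None, :], MAX_LEN, axis=1)
assert y_seq.shape == (20, MAX_LEN, 1)
```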
When I run imdb_birnn.py for the first time, I get the error below.
Isn't Transform() working?
$ python imdb_birnn.py
Using gpu device 0: GeForce GTX 770
Loading data...
20000 train sequences
5000 test sequences
Pad sequences (samples x time)
train_X shape: (20000, 100)
test_X shape: (5000, 100)
Build model...
Traceback (most recent call last):
File "imdb_birnn.py", line 64, in <module>
model.add(Dense(word_vec_len, 100, activation='relu'))
File "/usr/local/lib/python2.7/dist-packages/keras/layers/containers.py", line 37, in add
self.layers[-1].set_previous(self.layers[-2])
File "/usr/local/lib/python2.7/dist-packages/keras/layers/core.py", line 33, in set_previous
str(self.input_ndim) + " but previous layer has output_shape " + str(layer.output_shape)
AssertionError: Incompatible shapes: layer expected input with ndim=2 but previous layer has output_shape (None, None, 256)
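The mismatch is exactly what the assertion says: Dense wants a 2-D (batch, features) input, while the previous layer emits a 3-D (batch, time, 256) sequence. A NumPy sketch of the bridge a flattening layer provides (that this is what the repo's Transform layer does is an assumption):

```python
import numpy as np

batch, time, feats = 32, 100, 256
seq = np.zeros((batch, time, feats), dtype="float32")  # 3-D recurrent output

# Collapsing time and features into one axis yields a valid Dense input:
flat = seq.reshape(batch, time * feats)
assert flat.shape == (32, 25600)
```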
Hello. My sequences are of varying length, so many of them are 0-padded. From what I understand, the correct way to treat 0-padded sequences is with mask_zero=True on an Embedding layer.
text_model = Sequential()
text_model.add(Embedding(max_features, 64, mask_zero=True))
text_model.add(BiDirectionLSTM(64, 64, return_sequences=True))
The above code raises the following exception:
"Exception: Cannot connect non-masking layer to layer with masked output"
Do you have any plans to support masking? It seems like an important feature for dealing with sequences.
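Masking does matter for padded batches. The core idea, independent of this repo, is that zero-padded timesteps must not contribute to the recurrence or the loss; a minimal sketch of deriving a mask from padded token ids:

```python
import numpy as np

x = np.array([[3, 7, 0, 0],
              [5, 0, 0, 0]])   # 0 is the padding token id
mask = (x != 0)                # True wherever there is a real timestep
lengths = mask.sum(axis=1)     # effective length of each sequence
assert lengths.tolist() == [2, 1]
```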
Hi,
The latest version of Keras no longer ships theano_utils; is there any alternative for this?
Traceback (most recent call last):
File "cnn_blstm.py", line 28, in
from birnn import BiDirectionLSTM, Transform
File "/home/khanma0b/keras-master/deepfunc-master/birnn.py", line 5, in
from keras.utils.theano_utils import shared_zeros, alloc_zeros_matrix
ImportError: No module named theano_utils