nicolas-ivanov / debug_seq2seq


[unmaintained] Make seq2seq for keras work

Shell 2.57% Python 97.43%

seq2seq chatbot keras

debug_seq2seq's Introduction

debug seq2seq

Note: the repository is not maintained. Feel free to PM me if you'd like to take over the maintenance.

Make seq2seq for Keras work, and also try some other implementations of seq2seq.

The code includes:

small dataset of movie scripts to train your models on
preprocessor function to properly tokenize the data
word2vec helpers to make use of gensim word2vec lib for extra flexibility
train and predict functions to harness the power of seq2seq

Warning

The code has bugs, undoubtedly. Feel free to fix them and send a pull request.
No good results were achieved with this architecture yet. See 'Results' section below for details.

Papers

Nice picture

Setup&Run

git clone https://github.com/nicolas-ivanov/debug_seq2seq
cd debug_seq2seq
bash bin/setup.sh
python bin/train.py

and then

python bin/test.py

Results

No good results were achieved so far:

[why ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
[who ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
[yeah ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
[what is it ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as as i]
[why not ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
[really ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]

My guess is that there are some fundamental problems with this approach:

Since word2vec vectors are used for word representations and the model returns an approximate vector for every next word, the error accumulates from one word to the next, so starting from the third word the model fails to predict anything meaningful. This problem might be overcome by replacing the approximate word2vec vector at every timestep with a "correct" vector, i.e. the one that corresponds to an actual word from the dictionary. Does that make sense? However, you need to dig into the seq2seq code to do that.
The second problem relates to word sampling: even if you manage to solve the aforementioned issue, if you stick to using argmax() to pick the most probable word at every timestep, the answers are going to be too simple and uninteresting, like:

are you a human?			-- no .
are you a robot or human?	-- no .
are you a robot?			-- no .
are you better than siri?  		-- yes .
are you here ?				-- yes .
are you human?			-- no .
are you really better than siri?	-- yes .
are you there 				-- you ' re not going to be
are you there?!?!			-- yes .

Not to mislead you: these results were achieved on a different seq2seq architecture, based on tensorflow.
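
The first point above (replacing the approximate vector with a "correct" one) could be sketched as follows. This is only an illustration under stated assumptions, not code from this repository or from the seq2seq library: `snap_to_vocabulary` and the toy two-dimensional vocabulary are hypothetical, and a real setup would take the vectors from the trained word2vec model. The idea is to swap the decoder's approximate output for the closest vector that belongs to an actual word before it is fed into the next timestep.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def snap_to_vocabulary(predicted, vocab_vectors):
    """Return (token, vector) of the vocabulary entry closest to `predicted`.

    `vocab_vectors` maps token -> vector; hypothetical stand-in for a word2vec model.
    """
    best_token = max(
        vocab_vectors,
        key=lambda token: cosine_similarity(predicted, vocab_vectors[token]),
    )
    return best_token, vocab_vectors[best_token]


# Toy vocabulary, invented for illustration only.
toy_vocab = {"yes": [1.0, 0.0], "no": [0.0, 1.0], "$$$": [-1.0, 0.0]}

# A noisy decoder output that points roughly towards "yes".
token, vector = snap_to_vocabulary([0.9, 0.2], toy_vocab)
```

Feeding `vector` instead of the raw prediction into the next decoding step would keep the error from compounding, at the cost of one nearest-neighbour lookup per timestep.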

Sampling with temperature could be used to diversify the output; however, that again should be done inside the seq2seq library.
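
For reference, sampling with temperature can be sketched in a few lines. This is a generic illustration, not the seq2seq library's API: `sample_with_temperature` is a hypothetical helper that takes the probability distribution over the token dictionary for one timestep. A temperature close to 0 behaves like argmax(), while higher values flatten the distribution and produce more varied answers.

```python
import math
import random


def sample_with_temperature(probs, temperature=1.0, rng=random):
    """Pick a token index from `probs` after rescaling by `temperature`.

    temperature <= 0 falls back to plain argmax.
    """
    if temperature <= 0:
        return max(range(len(probs)), key=lambda i: probs[i])

    # Rescale log-probabilities; the small floor avoids log(0).
    logits = [math.log(max(p, 1e-12)) / temperature for p in probs]
    # Subtract the maximum for numerical stability before exponentiating.
    top = max(logits)
    weights = [math.exp(l - top) for l in logits]

    # Draw from the (unnormalised) rescaled distribution.
    threshold = rng.random() * sum(weights)
    cumulative = 0.0
    for index, weight in enumerate(weights):
        cumulative += weight
        if threshold < cumulative:
            return index
    return len(weights) - 1


# Hypothetical distribution over a three-token dictionary.
example_probs = [0.1, 0.7, 0.2]
greedy_choice = sample_with_temperature(example_probs, temperature=0)
diverse_choice = sample_with_temperature(example_probs, temperature=1.5, rng=random.Random(0))
```

With `temperature=1.0` the original distribution is sampled unchanged, so the parameter can be tuned between "always the same short answer" and "too random to be coherent".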

debug_seq2seq's Issues

Nice tools for drawing network?

Hi, could you tell me which tool you used to draw the nice network picture, like this one:

Process with end-of-sentence symbol

Hi guys.
I'm running Nicolas's code and I have some questions about the end-of-sentence symbol, i.e. "$$$".
As I understand it, Nicolas puts "$$$" at the end of each sentence. So my questions are:

If I understand correctly, in Nicolas's code "$$$" seems to be treated like a normal word, i.e. the word2vec model generates a vector for "$$$" just as it does for other words. Is the symbol important, i.e. can I remove it from each sentence without affecting the performance?
If the symbol is important, how can I add it to speech data, where each "sentence" is a sequence of feature vectors? Do I just define a fixed arbitrary vector as "$$$" and append it to the end of each "sentence"?

Thank you in advance
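
To make the role of the symbol concrete, here is a small sketch of how an end-of-sentence token is typically used. It is not taken from this repository: `add_eos` and `truncate_at_eos` are hypothetical helpers. The token is appended during preprocessing so the model can learn where an answer ends, and at prediction time everything from the first "$$$" onwards is dropped, since the decoder always emits a fixed number of tokens.

```python
EOS_TOKEN = "$$$"  # the end-of-sentence symbol discussed above


def add_eos(tokens):
    """Return a copy of `tokens` that ends with exactly one EOS token."""
    if tokens and tokens[-1] == EOS_TOKEN:
        return list(tokens)
    return list(tokens) + [EOS_TOKEN]


def truncate_at_eos(tokens):
    """Keep only the tokens generated before the first EOS token."""
    if EOS_TOKEN in tokens:
        return tokens[:tokens.index(EOS_TOKEN)]
    return list(tokens)


training_line = add_eos(["who", "are", "you", "?"])
answer = truncate_at_eos(["no", ".", "$$$", ".", "$$$", "$$$"])
```

For input that is a sequence of feature vectors rather than words, the analogous move would be to reserve one fixed vector as the end marker, which is an assumption to validate experimentally rather than an established recipe.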

RuntimeError when i'm running bin/train.py

  File "/home/devkite/anaconda2/lib/python2.7/site-packages/theano/configdefaults.py", line 1252, in check_mkl_openmp
    raise RuntimeError('To use MKL 2018 with Theano you MUST set "MKL_THREADING_LAYER=GNU" in your environement.')
RuntimeError: To use MKL 2018 with Theano you MUST set "MKL_THREADING_LAYER=GNU" in your environement.

How do I set "MKL_THREADING_LAYER=GNU"?
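
One way to do this, sketched under the assumption that the variable only needs to be present in the process environment before Theano is first imported: set it at the very top of the entry script. Prefixing the command in the shell (`MKL_THREADING_LAYER=GNU python bin/train.py`) should have the same effect. Neither variant has been verified against this repository.

```python
import os

# Must run before the first `import theano` (or `import keras` with the
# Theano backend), because the error above is raised while Theano is imported.
os.environ["MKL_THREADING_LAYER"] = "GNU"

# The Theano/Keras imports of the script would follow from here on.
mkl_layer = os.environ["MKL_THREADING_LAYER"]
```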

Upgrade to latest Seq2seq API

TypeError: can only concatenate tuple (not "list") to tuple

andy1028@andy1028-Envy:/media/andy1028/data1t/os_prj/github/debug_seq2seq$ python bin/train.py
INFO:gensim.utils:Pattern library is not installed, lemmatization won't be available.
INFO:summa.preprocessing.cleaner:'pattern' package not found; tag filters are not available for English
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
INFO:lib.dialog_processor:Loading corpus data...
INFO:lib.dialog_processor:/var/lib/try_seq2seq/corpora_processed/movie_lines_cleaned_m1.txt and /var/lib/try_seq2seq/words_index/w_idx_movie_lines_cleaned_m1.txt exist, loading files from disk
INFO:__main__:-----
INFO:lib.w2v_model.w2v:Loading model from /var/lib/try_seq2seq/w2v_models/movie_lines_cleaned_w5_m1_v256.bin
INFO:gensim.utils:loading Word2Vec object from /var/lib/try_seq2seq/w2v_models/movie_lines_cleaned_w5_m1_v256.bin
INFO:gensim.utils:setting ignored attribute syn0norm to None
INFO:gensim.utils:setting ignored attribute cum_table to None
INFO:lib.w2v_model.w2v:Model "movie_lines_cleaned_w5_m1_v256.bin" has been loaded.
INFO:__main__:-----
INFO:lib.nn_model.model:Initializing NN model with the following params:
INFO:lib.nn_model.model:Input dimension: 256 (token vector size)
INFO:lib.nn_model.model:Hidden dimension: 512
INFO:lib.nn_model.model:Output dimension: 20001 (token dict size)
INFO:lib.nn_model.model:Input seq length: 16
INFO:lib.nn_model.model:Output seq length: 6
INFO:lib.nn_model.model:Batch size: 32
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GTX 950M
major: 5 minor: 0 memoryClockRate (GHz) 1.124
pciBusID 0000:01:00.0
Total memory: 4.00GiB
Free memory: 3.75GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:839] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 950M, pci bus id: 0000:01:00.0)
Traceback (most recent call last):
  File "bin/train.py", line 37, in <module>
    learn()
  File "bin/train.py", line 30, in learn
    nn_model = get_nn_model(token_dict_size=len(index_to_token))
  File "/media/andy1028/data1t/os_prj/github/debug_seq2seq/lib/nn_model/model.py", line 29, in get_nn_model
    depth=1
  File "/usr/local/lib/python2.7/dist-packages/seq2seq/models.py", line 76, in SimpleSeq2Seq
    model.add(encoder)
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 276, in add
    layer.create_input_layer(batch_input_shape, input_dtype)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 370, in create_input_layer
    self(x)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 514, in __call__
    self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 572, in add_inbound_node
    Node.create_node(self, inbound_layers, node_indices, tensor_indices)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 149, in create_node
    output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
  File "build/bdist.linux-x86_64/egg/recurrentshop/engine.py", line 296, in call
  File "build/bdist.linux-x86_64/egg/recurrentshop/engine.py", line 51, in
TypeError: can only concatenate tuple (not "list") to tuple
andy1028@andy1028-Envy:/media/andy1028/data1t/os_prj/github/debug_seq2seq$ git pull
Already up-to-date.
andy1028@andy1028-Envy:/media/andy1028/data1t/os_prj/github/debug_seq2seq$

SimpleSeq2Seq

When I train SimpleSeq2Seq I get this error:

optional_input_placeholder = _to_list(_OptionalInputPlaceHolder().inbound_nodes[0].output_tensors)[0]
AttributeError: '_OptionalInputPlaceHolder' object has no attribute 'inbound_nodes'

from recurrentshop. I am using Python 2.7.13.

Thanks
Natan Katz

error about debug_seq2seq

Hi, @nicolas-ivanov

I ran the training code. Maybe it contains some errors. I get the error below:

ValueError: Shape mismatch: x has 64 rows but z has 24 rows
Apply node that caused the error: Gemm{no_inplace}(Subtensor{::, int64::}.0, TensorConstant{0.20000000298}, <TensorType(float32, matrix)>, lstm_U_o_copy, TensorConstant{0.20000000298})
Toposort index: 5
Inputs types: [TensorType(float32, matrix), TensorType(float32, scalar), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, scalar)]
Inputs shapes: [(24, 128), (), (64, 128), (128, 128), ()]
Inputs strides: [(32768, 4), (), (512, 4), (512, 4), ()]
Inputs values: ['not shown', array(0.20000000298023224, dtype=float32), 'not shown', 'not shown', array(0.20000000298023224, dtype=float32)]
Outputs clients: [[Elemwise{Composite{(clip((i0 + i1), i2, i3) * tanh(i4))}}(TensorConstant{(1, 1) of 0.5}, Gemm{no_inplace}.0, TensorConstant{(1, 1) of 0}, TensorConstant{(1, 1) of 1}, Elemwise{Composite{((clip((i0 + i1), i2, i3) * i4) + (clip((i5 + i6), i2, i3) * tanh(i7)))}}.0)]]

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
Apply node that caused the error: forall_inplace,cpu,scan_fn}(TensorConstant{16}, InplaceDimShuffle{1,0,2}.0, IncSubtensor{InplaceSet;:int64:}.0, DeepCopyOp.0, TensorConstant{16}, lstm_U_o, lstm_U_f, lstm_U_i, lstm_U_c)
Toposort index: 36
Inputs types: [TensorType(int64, scalar), TensorType(float32, 3D), TensorType(float32, (True, False, False)), TensorType(float32, (True, False, False)), TensorType(int64, scalar), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, matrix)]
Inputs shapes: [(), (16, 24, 512), (1, 64, 128), (1, 64, 128), (), (128, 128), (128, 128), (128, 128), (128, 128)]
Inputs strides: [(), (2048, 32768, 4), (32768, 512, 4), (32768, 512, 4), (), (512, 4), (512, 4), (512, 4), (512, 4)]
Inputs values: [array(16), 'not shown', 'not shown', 'not shown', array(16), 'not shown', 'not shown', 'not shown', 'not shown']
Outputs clients: [[], [], [InplaceDimShuffle{0,1,2}(forall_inplace,cpu,scan_fn}.2)]]

How can I solve it ?

Cannot import name initializers

I was trying to run this code on Python 2.7, Ubuntu 16.04, but I am getting an import error. This is the error log:

"Traceback (most recent call last):
  File "bin/train.py", line 10, in <module>
    from lib.nn_model.model import get_nn_model
  File "/home/gmo/debug_seq2seq/lib/nn_model/model.py", line 4, in <module>
    from seq2seq.models import SimpleSeq2seq
  File "build/bdist.linux-x86_64/egg/seq2seq/__init__.py", line 1, in <module>
  File "build/bdist.linux-x86_64/egg/seq2seq/cells.py", line 1, in <module>
  File "build/bdist.linux-x86_64/egg/recurrentshop/__init__.py", line 1, in <module>
  File "build/bdist.linux-x86_64/egg/recurrentshop/engine.py", line 3, in <module>
ImportError: cannot import name initializers"

Can anyone tell me how to fix this? Thanks

Module 'utils'

Hello,

I am getting this error when trying to run the example:

ModuleNotFoundError: No module named 'utils.utils'

ImportError running bin/train.py

Hi there,
I am trying to start learning this code. After downloading and installing everything, I tried to run your example code, but I am getting this error. Could you please point out my mistake?
Thanks a lot

$ sudo python bin/train.py
INFO:gensim.utils:'pattern' package found; utils.lemmatize() is available for English
INFO:summa.preprocessing.cleaner:'pattern' package found; tag filters are available for English
Using Theano backend.
/usr/local/lib/python2.7/dist-packages/theano/tensor/signal/downsample.py:6: UserWarning: downsample module has been moved to the theano.tensor.signal.pool module.
"downsample module has been moved to the theano.tensor.signal.pool module.")
Traceback (most recent call last):
  File "bin/train.py", line 10, in <module>
    from lib.nn_model.model import get_nn_model
  File "/home/creangel/Downloads/keras/seq2seq/debug_seq2seq/lib/nn_model/model.py", line 4, in <module>
    from seq2seq.models import SimpleSeq2seq
  File "build/bdist.linux-x86_64/egg/seq2seq/__init__.py", line 1, in <module>
  File "build/bdist.linux-x86_64/egg/seq2seq/cells.py", line 1, in <module>
  File "build/bdist.linux-x86_64/egg/recurrentshop/__init__.py", line 1, in <module>
  File "build/bdist.linux-x86_64/egg/recurrentshop/engine.py", line 1, in <module>
ImportError: cannot import name Layer

sanity check

I came up with the following sanity check to ensure that the implementation and word embeddings etc are good.

I created a dataset of 100,000 lines, that has the following 6 lines repeated over and over again:

hi . $$$
hi , joey . $$$
hello ? $$$
who are you ? $$$
what are you doing ? $$$
nothing much . you ? $$$
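
A dataset like this can be generated with a few lines of Python. The sketch below is an assumption about how such a file might be produced, not the script actually used for this check; `build_corpus`, `write_corpus` and the output path are hypothetical names.

```python
from itertools import cycle, islice

# The six lines listed above, each already terminated with the "$$$" symbol.
SANITY_LINES = [
    "hi . $$$",
    "hi , joey . $$$",
    "hello ? $$$",
    "who are you ? $$$",
    "what are you doing ? $$$",
    "nothing much . you ? $$$",
]


def build_corpus(lines, total):
    """Repeat `lines` in order until exactly `total` lines are produced."""
    return list(islice(cycle(lines), total))


def write_corpus(path, lines, total=100000):
    """Write the repeated corpus to `path`, one dialog line per row."""
    with open(path, "w") as handle:
        for line in build_corpus(lines, total):
            handle.write(line + "\n")


corpus = build_corpus(SANITY_LINES, 100000)
# write_corpus("sanity_corpus.txt", SANITY_LINES)  # hypothetical output path
```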

I then ran your code with the following parameters and model:

TOKEN_REPRESENTATION_SIZE = 32  # word2vec parameter
HIDDEN_LAYER_DIMENSION = 4096  # number of nodes in each LSTM layer
seq2seq = Seq2seq(
    batch_input_shape=(SAMPLES_BATCH_SIZE, INPUT_SEQUENCE_LENGTH, TOKEN_REPRESENTATION_SIZE),
    hidden_dim=HIDDEN_LAYER_DIMENSION,
    output_length=ANSWER_MAX_TOKEN_LENGTH,
    output_dim=token_dict_size,
    depth=2,
    dropout=0.25,
    peek=True
)

opt = adagrad(clipvalue=50)
model.compile(loss='sparse_categorical_crossentropy', optimizer=opt, metrics=["accuracy"])

After 10 data passes, my results look like this:

INFO:lib.nn_model.train:[hi. ] -> [$$$ doing who who $$$ $$$ $$$]
INFO:lib.nn_model.train:[hello ?] -> [$$$ doing who who $$$ $$$ $$$] 
INFO:lib.nn_model.train:[who are you ?] -> [$$$ doing who who $$$ $$$ $$$]
INFO:lib.nn_model.train:[what are you doing ?] -> [$$$ doing who who $$$ $$$ $$$]

So basically, the sanity check fails. The model can't even learn the answer to these 6 lines, even though they were repeated so many times. Does anyone know why this is happening? What could be the problem?

Got an error when I run the train.py

Here is the error I got:

INFO:gensim.utils:detected Windows; aliasing chunkize to chunkize_serial
INFO:summa.preprocessing.cleaner:'pattern' package not found; tag filters are not available for English
INFO:lib.dialog_processor:Loading corpus data...
INFO:lib.dialog_processor:data/try_seq2seq\corpora_processed\movie_lines_cleaned_10k_m5.txt and data/try_seq2seq\words_index\w_idx_movie_lines_cleaned_10k_m5.txt exist, loading files from disk
INFO:__main__:-----
INFO:lib.w2v_model.w2v:Loading model from data/try_seq2seq\w2v_models\movie_lines_cleaned_10k_w5_m5_v128.bin
INFO:gensim.utils:loading Word2Vec object from data/try_seq2seq\w2v_models\movie_lines_cleaned_10k_w5_m5_v128.bin
INFO:gensim.utils:setting ignored attribute syn0norm to None
INFO:gensim.utils:setting ignored attribute cum_table to None
INFO:lib.w2v_model.w2v:Model "movie_lines_cleaned_10k_w5_m5_v128.bin" has been loaded.
INFO:__main__:-----
INFO:lib.nn_model.model:Initializing NN model with the following params:
INFO:lib.nn_model.model:Input dimension: 128 (token vector size)
INFO:lib.nn_model.model:Hidden dimension: 256
INFO:lib.nn_model.model:Output dimension: 1768 (token dict size)
INFO:lib.nn_model.model:Input seq length: 8
INFO:lib.nn_model.model:Output seq length: 8
INFO:lib.nn_model.model:Batch size: 64
INFO:lib.nn_model.model:Model is built
INFO:__main__:-----
INFO:lib.nn_model.train:Full-data-pass iteration num: 1
Epoch 1/1
Traceback (most recent call last):
  File "bin/train.py", line 37, in <module>
    learn()
  File "bin/train.py", line 33, in learn
    train_model(nn_model, w2v_model, dialog_lines_for_nn, index_to_token)
  File "E:\work\github\nicolas-ivanov\debug_seq2seq\lib\nn_model\train.py", line 84, in train_model
    nn_model.fit(X_train, Y_train, batch_size=TRAIN_BATCH_SIZE, nb_epoch=1, show_accuracy=True, verbose=1)
  File "build\bdist.win-amd64\egg\keras\models.py", line 489, in fit
  File "build\bdist.win-amd64\egg\keras\models.py", line 210, in _fit
  File "D:\soft\tool\Anaconda\lib\site-packages\theano-0.7.0-py2.7.egg\theano\compile\function_module.py", line 871, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "D:\soft\tool\Anaconda\lib\site-packages\theano-0.7.0-py2.7.egg\theano\gof\link.py", line 314, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "D:\soft\tool\Anaconda\lib\site-packages\theano-0.7.0-py2.7.egg\theano\compile\function_module.py", line 859, in __call__
    outputs = self.fn()
  File "D:\soft\tool\Anaconda\lib\site-packages\theano-0.7.0-py2.7.egg\theano\gof\op.py", line 876, in rval
    r = p(n, [x[0] for x in i], o)
  File "D:\soft\tool\Anaconda\lib\site-packages\theano-0.7.0-py2.7.egg\theano\tensor\subtensor.py", line 2160, in perform
    out[0] = inputs[0].__getitem__(inputs[1:])
IndexError: index 8 is out of bounds for axis 0 with size 8
Apply node that caused the error: AdvancedSubtensor(Subtensor{int64::}.0, Subtensor{int64}.0, Subtensor{int64}.0)
Toposort index: 486
Inputs types: [TensorType(float64, 3D), TensorType(int64, vector), TensorType(int64, vector)]
Inputs shapes: [(8L, 64L, 1768L), (512L,), (512L,)]
Inputs strides: [(905216L, 14144L, 8L), (8L,), (8L,)]
Inputs values: ['not shown', 'not shown', 'not shown']
Outputs clients: [[Shape_i{1}(AdvancedSubtensor.0), Elemwise{Sub}[(0, 1)](AdvancedSubtensor.0, AdvancedSubtensor.0)]]

Backtrace when the node is created:
  File "build\bdist.win-amd64\egg\keras\models.py", line 70, in weighted
    filtered_y_pred = y_pred[weights.nonzero()[:-1]]

Error for different input sequence length and output sequence length

I am getting the following error during prediction. The model predicts fine when the input and output sequence lengths are equal.

Input sequence length = 16
Output sequence length = 6
Could you help?

Traceback (most recent call last):
  File "/home/jaimita/debug_seq2seq/bin/train.py", line 39, in <module>
    learn()
  File "/home/jaimita/debug_seq2seq/bin/train.py", line 36, in learn
    train_model(nn_model, w2v_model, dialog_lines_for_nn, index_to_token)
  File "/home/jaimita/debug_seq2seq/lib/nn_model/train.py", line 96, in train_model
    log_predictions(test_sentences, nn_model, w2v_model, index_to_token)
  File "/home/jaimita/debug_seq2seq/lib/nn_model/train.py", line 21, in log_predictions
    prediction = predict_sentence(sent, nn_model, w2v_model, index_to_token)
  File "/home/jaimita/debug_seq2seq/lib/nn_model/predict.py", line 47, in predict_sentence
    tokens_sequence = _predict_sequence(input_sequence, nn_model, w2v_model, index_to_token, diversity)
  File "/home/jaimita/debug_seq2seq/lib/nn_model/predict.py", line 34, in _predict_sequence
    predictions = nn_model.predict(X, verbose=0)[0]
  File "build/bdist.linux-x86_64/egg/keras/models.py", line 661, in predict
  File "build/bdist.linux-x86_64/egg/keras/models.py", line 322, in _predict_loop
  File "build/bdist.linux-x86_64/egg/keras/backend/theano_backend.py", line 384, in __call__
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 871, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/link.py", line 314, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 859, in __call__
    outputs = self.fn()
  File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan_op.py", line 963, in rval
    r = p(n, [x[0] for x in i], o)
  File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan_op.py", line 952, in <lambda>
    self, node)
  File "theano/scan_module/scan_perform.pyx", line 405, in theano.scan_module.scan_perform.perform (/home/jaimita/.theano/compiledir_Linux-3.19--generic-x86_64-with-Ubuntu-14.04-trusty-x86_64-2.7.6-64/scan_perform/mod.cpp:4316)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/link.py", line 314, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "theano/scan_module/scan_perform.pyx", line 397, in theano.scan_module.scan_perform.perform (/home/jaimita/.theano/compiledir_Linux-3.19--generic-x86_64-with-Ubuntu-14.04-trusty-x86_64-2.7.6-64/scan_perform/mod.cpp:4193)
ValueError: Input dimension mis-match. (input[0].shape[0] = 6, input[1].shape[0] = 16)
Apply node that caused the error: Elemwise{Add}[(0, 1)](InplaceDimShuffle{1,0,2}.0, InplaceDimShuffle{1,0,2}.0)
Toposort index: 31
Inputs types: [TensorType(float32, 3D), TensorType(float32, 3D)]
Inputs shapes: [(6, 32, 128), (16, 32, 128)]
Inputs strides: [(16384, 512, 4), (512, 8192, 4)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[Subtensor{int64:int64:int8}(Elemwise{Add}[(0, 1)].0, ScalarFromTensor.0, ScalarFromTensor.0, Constant{1})]]

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
Apply node that caused the error: forall_inplace,cpu,scan_fn}(TensorConstant{6}, IncSubtensor{InplaceSet;:int64:}.0, IncSubtensor{Set;:int64:}.0, IncSubtensor{InplaceSet;:int64:}.0, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, vector)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, InplaceDimShuffle{1,0,2}.0, InplaceDimShuffle{x,0}.0, InplaceDimShuffle{x,0}.0, InplaceDimShuffle{x,0}.0, InplaceDimShuffle{x,0}.0, InplaceDimShuffle{x,0}.0, InplaceDimShuffle{x,0}.0)
Toposort index: 384
Inputs types: [TensorType(int8, scalar), TensorType(float32, 3D), TensorType(float32, (True, False, False)), TensorType(float32, (True, False, False)), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, vector), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, matrix), TensorType(float32, 3D), TensorType(float32, row), TensorType(float32, row), TensorType(float32, row), TensorType(float32, row), TensorType(float32, row), TensorType(float32, row)]
Inputs shapes: [(), (6, 32, 128), (1, 32, 128), (1, 32, 128), (128, 128), (128, 1), (1,), (128, 128), (128, 128), (128, 128), (128, 128), (128, 128), (128, 128), (128, 128), (128, 128), (128, 128), (128, 128), (128, 128), (128, 128), (32, 6, 128), (1, 128), (1, 1), (1, 128), (1, 128), (1, 128), (1, 128)]
Inputs strides: [(), (16384, 512, 4), (16384, 512, 4), (16384, 512, 4), (512, 4), (4, 4), (4,), (512, 4), (512, 4), (512, 4), (512, 4), (512, 4), (512, 4), (512, 4), (512, 4), (512, 4), (512, 4), (512, 4), (512, 4), (512, 16384, 4), (512, 4), (4, 4), (512, 4), (512, 4), (512, 4), (512, 4)]
Inputs values: [array(6, dtype=int8), 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', array([ -2.43138842e-14], dtype=float32), 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', 'not shown', array([[ -2.43138842e-14]], dtype=float32), 'not shown', 'not shown', 'not shown', 'not shown']
Outputs clients: [[forall_inplace,cpu,scan_fn}(TensorConstant{6}, forall_inplace,cpu,scan_fn}.0, Alloc.0, IncSubtensor{InplaceSet;:int64:}.0, IncSubtensor{Set;:int64:}.0, IncSubtensor{InplaceSet;:int64:}.0, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, <TensorType(float32, matrix)>, InplaceDimShuffle{x,0}.0, InplaceDimShuffle{x,0}.0, InplaceDimShuffle{x,0}.0, InplaceDimShuffle{x,0}.0)], [], []]

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.

Process finished with exit code 1

Why doesn't the function 'save_model(nn_model)' work?

I noticed that the function 'save_model(nn_model)' in lib/nn_model/train.py doesn't work so far. There seems to be an example of saving and loading a model using JSON and HDF5:

from keras.models import model_from_json

# Save the architecture as JSON and the weights as HDF5.
with open('my_model_architecture.json', 'w') as f:
    f.write(model.to_json())
model.save_weights('my_model_weights.h5')

# Elsewhere: rebuild the model from JSON and load the weights.
with open('my_model_architecture.json') as f:
    model = model_from_json(f.read())
model.load_weights('my_model_weights.h5')

Does this method work?

ValueError: numpy.dtype has the wrong size, try recompiling

I get the following error when running bin/train.py

1482-sraval:debug_seq2seq sraval$ python bin/train.py 
Traceback (most recent call last):
  File "bin/train.py", line 9, in <module>
    from lib.w2v_model import w2v
  File "/Users/sraval/Desktop/yooo/debug_seq2seq/lib/w2v_model/w2v.py", line 4, in <module>
    from gensim.models import Word2Vec
  File "/Library/Python/2.7/site-packages/gensim/__init__.py", line 6, in <module>
    from gensim import parsing, matutils, interfaces, corpora, models, similarities, summarization
  File "/Library/Python/2.7/site-packages/gensim/models/__init__.py", line 13, in <module>
    from .word2vec import Word2Vec
  File "/Library/Python/2.7/site-packages/gensim/models/word2vec.py", line 100, in <module>
    from gensim.models.word2vec_inner import train_sentence_sg, train_sentence_cbow, FAST_VERSION,\
  File "__init__.pxd", line 155, in init gensim.models.word2vec_inner (./gensim/models/word2vec_inner.c:9234)
ValueError: numpy.dtype has the wrong size, try recompiling

ImportError: cannot import name weight

Hi,
I am getting this error when I call python train.py,

Traceback (most recent call last):
  File "train.py", line 10, in <module>
    from lib.nn_model.model import get_nn_model
  File "/home/afo214/tensorflow/vrp/seqTOseq/debug_seq2seq/lib/nn_model/model.py", line 4, in <module>
    from seq2seq.models import SimpleSeq2seq
  File "build/bdist.linux-x86_64/egg/seq2seq/__init__.py", line 1, in <module>
  File "build/bdist.linux-x86_64/egg/seq2seq/cells.py", line 1, in <module>
ImportError: cannot import name weight

Any idea?

ImportError: No module named gensim.models

When I try to run train.py, this error occurs:
ljy@ubuntu:~/debug_seq2seq$ python bin/train.py
Traceback (most recent call last):
  File "bin/train.py", line 9, in <module>
    from lib.w2v_model import w2v
  File "/home/ljy/debug_seq2seq/lib/w2v_model/w2v.py", line 4, in <module>
    from gensim.models import Word2Vec
ImportError: No module named gensim.models

But I have just installed gensim using
sudo easy_install -U gensim
Could anyone help me?

meaningful result?

Hi Nicolas,
First, thanks a lot for your work. When I run your code, I cannot get meaningful results; all I get is output like

INFO:lib.nn_model.train:[why ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
INFO:lib.nn_model.train:[who ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
INFO:lib.nn_model.train:[yeah ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
INFO:lib.nn_model.train:[what is it ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as as i]
INFO:lib.nn_model.train:[why not ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
INFO:lib.nn_model.train:[really ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
INFO:lib.nn_model.train:[huh ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
INFO:lib.nn_model.train:[yes ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
INFO:lib.nn_model.train:[what ' s that ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as as i]
INFO:lib.nn_model.train:[what are you doing ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as as i]
INFO:lib.nn_model.train:[what are you talking about ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as as i]
INFO:lib.nn_model.train:[what happened ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as as i]
INFO:lib.nn_model.train:[hello ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
INFO:lib.nn_model.train:[where ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
INFO:lib.nn_model.train:[how ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
INFO:lib.nn_model.train:[excuse me ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as i i]
INFO:lib.nn_model.train:[who are you ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as as i]
INFO:lib.nn_model.train:[what do you want ?] -> [i ' . . $$$ . $$$ $$$ $$$ $$$ as as as as as i]
INFO:lib.nn_model.train:[what ' s wrong ?] -> [i ' . . $$$ .

INFO:lib.nn_model.train:[what are you talking about ?] -> [i ' . . . . . . . . , , , , , ,]
INFO:lib.nn_model.train:[what happened ?] -> [i ' . . . . . . . . , , , , , ,]
INFO:lib.nn_model.train:[hello ?] -> [i ' . . . . . . . . , , , , , ,]
INFO:lib.nn_model.train:[where ?] -> [i ' . . . . . . . . , , , , , ,]
INFO:lib.nn_model.train:[how ?] -> [i ' . . . . . . . . , , , , , ,]
INFO:lib.nn_model.train:[excuse me ?] -> [i ' . . . . . . . . , , , , , ,]
INFO:lib.nn_model.train:[who are you ?] -> [i ' . . . . . . . . , , , , , ,]

Could you share your opinion with me? I'd really appreciate it.

This repository is so old

I am very happy to have found this repository, which shows how to train a seq2seq model.

But this repository may be a little old, and many of its dependencies should be updated:

import SimpleSeq2seq should be import SimpleSeq2Seq,
The gensim version 0.12.1 is too old to install, and the new version of gensim has some different syntax, etc.

Could the author spend a little time updating this repository? I would appreciate it.

I found a small bug

File: lib/nn_model/train.py
Line: 68
def save_model(nn_model):
    #model_full_path = os.path.join(DATA_PATH, 'nn_models', NN_MODEL_PATH)
    nn_model.save_weights(NN_MODEL_PATH, overwrite=True)

Unable to install gensim in anaconda

I am getting this error:
Could not install packages due to an EnvironmentError: [Errno 21] Is a directory: '/home/ram/.local/lib/python3.7/site-packages/pip-19.0.1.dist-info/METADATA'

ImportError: cannot import name SimpleSeq2seq

Here is what I did:

andy1028@andy1028-Envy:/media/andy1028/data1t/os_prj/github/debug_seq2seq/seq2seq$ git remote -v
origin https://github.com/farizrahman4u/seq2seq.git (fetch)
origin https://github.com/farizrahman4u/seq2seq.git (push)
andy1028@andy1028-Envy:/media/andy1028/data1t/os_prj/github/debug_seq2seq/seq2seq$ sudo python3 setup.py install
running install
running bdist_egg
running egg_info
writing seq2seq.egg-info/PKG-INFO
writing requirements to seq2seq.egg-info/requires.txt
writing dependency_links to seq2seq.egg-info/dependency_links.txt
writing top-level names to seq2seq.egg-info/top_level.txt
reading manifest file 'seq2seq.egg-info/SOURCES.txt'
writing manifest file 'seq2seq.egg-info/SOURCES.txt'
.....

TypeError: int returned non-int (type NoneType)

andy1028@andy1028-Envy:/media/andy1028/data1t/os_prj/github/debug_seq2seq$ python bin/train.py
INFO:gensim.utils:Pattern library is not installed, lemmatization won't be available.
INFO:summa.preprocessing.cleaner:'pattern' package not found; tag filters are not available for English
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
INFO:lib.dialog_processor:Loading corpus data...
INFO:lib.dialog_processor:/var/lib/try_seq2seq/corpora_processed/movie_lines_cleaned_m1.txt and /var/lib/try_seq2seq/words_index/w_idx_movie_lines_cleaned_m1.txt exist, loading files from disk
INFO:main:-----
INFO:lib.w2v_model.w2v:Loading model from /var/lib/try_seq2seq/w2v_models/movie_lines_cleaned_w5_m1_v256.bin
INFO:gensim.utils:loading Word2Vec object from /var/lib/try_seq2seq/w2v_models/movie_lines_cleaned_w5_m1_v256.bin
INFO:gensim.utils:setting ignored attribute syn0norm to None
INFO:gensim.utils:setting ignored attribute cum_table to None
INFO:lib.w2v_model.w2v:Model "movie_lines_cleaned_w5_m1_v256.bin" has been loaded.
INFO:main:-----
INFO:lib.nn_model.model:Initializing NN model with the following params:
INFO:lib.nn_model.model:Input dimension: 256 (token vector size)
INFO:lib.nn_model.model:Hidden dimension: 512
INFO:lib.nn_model.model:Output dimension: 20001 (token dict size)
INFO:lib.nn_model.model:Input seq length: 16
INFO:lib.nn_model.model:Output seq length: 6
INFO:lib.nn_model.model:Batch size: 32
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GTX 950M
major: 5 minor: 0 memoryClockRate (GHz) 1.124
pciBusID 0000:01:00.0
Total memory: 4.00GiB
Free memory: 3.68GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:839] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 950M, pci bus id: 0000:01:00.0)
Traceback (most recent call last):
File "bin/train.py", line 37, in <module>
learn()
File "bin/train.py", line 30, in learn
nn_model = get_nn_model(token_dict_size=len(index_to_token))
File "/media/andy1028/data1t/os_prj/github/debug_seq2seq/lib/nn_model/model.py", line 29, in get_nn_model
depth=1
File "build/bdist.linux-x86_64/egg/seq2seq/models.py", line 77, in SimpleSeq2Seq
File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 308, in add
output_tensor = layer(self.outputs[0])
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 514, in call
self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 572, in add_inbound_node
Node.create_node(self, inbound_layers, node_indices, tensor_indices)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 149, in create_node
output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
File "build/bdist.linux-x86_64/egg/recurrentshop/engine.py", line 305, in call
File "build/bdist.linux-x86_64/egg/recurrentshop/engine.py", line 51, in
File "/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py", line 1175, in rnn
state_size = int(states[0].get_shape()[-1])
TypeError: int returned non-int (type NoneType)
andy1028@andy1028-Envy:/media/andy1028/data1t/os_prj/github/debug_seq2seq$

Overwrite the model

debug_seq2seq/lib/nn_model/model.py

Line 35 in eb2147f

model.save_weights(NN_MODEL_PATH)

Here an uninitialized model is stored and then loaded, potentially overwriting already existing models.
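One possible way to avoid clobbering trained weights is to write the initial weights only when no file exists at the target path. The sketch below is an illustration, not the repository's code: `save_initial_weights` is a hypothetical helper, and it assumes only that the model object exposes `save_weights(path)`, as in the line quoted above.

```python
import os


def save_initial_weights(model, weights_path):
    """Save freshly initialized weights only if no weights file exists yet.

    Hypothetical helper: `model` is assumed to expose save_weights(path).
    Returns True if a file was written, False if an existing file was kept.
    """
    if os.path.isfile(weights_path):
        # Keep the existing (possibly trained) weights untouched.
        return False
    model.save_weights(weights_path)
    return True
```

With a guard like this, a later load of NN_MODEL_PATH would pick up previously trained weights when they exist, and the untrained ones otherwise.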

recurrentcontainer_2 shape mismatch

I tried to reproduce your results, but failed. After some minor fixes (changed case, etc.), I get:

INFO:lib.nn_model.train:Full-data-pass iteration num: 1
/home/akhavr/src/seq2seq/.env/local/lib/python2.7/site-packages/keras/models.py:610: UserWarning: The "show_accuracy" argument is deprecated, instead you should pass the "accuracy" metric to the model at compile time:
`model.compile(optimizer, loss, metrics=["accuracy"])`
  warnings.warn('The "show_accuracy" argument is deprecated, '
Traceback (most recent call last):
  File "bin/train.py", line 37, in <module>
    learn()
  File "bin/train.py", line 33, in learn
    train_model(nn_model, w2v_model, dialog_lines_for_nn, index_to_token)
  File "/home/akhavr/src/seq2seq/debug_seq2seq/lib/nn_model/train.py", line 87, in train_model
    nn_model.fit(X_train, Y_train, batch_size=TRAIN_BATCH_SIZE, nb_epoch=1, show_accuracy=True, verbose=1)
  File "/home/akhavr/src/seq2seq/.env/local/lib/python2.7/site-packages/keras/models.py", line 627, in fit
    sample_weight=sample_weight)
  File "/home/akhavr/src/seq2seq/.env/local/lib/python2.7/site-packages/keras/engine/training.py", line 1052, in fit
    batch_size=batch_size)
  File "/home/akhavr/src/seq2seq/.env/local/lib/python2.7/site-packages/keras/engine/training.py", line 983, in _standardize_user_data
    exception_prefix='model target')
  File "/home/akhavr/src/seq2seq/.env/local/lib/python2.7/site-packages/keras/engine/training.py", line 111, in standardize_input_data
    str(array.shape))
Exception: Error when checking model target: expected recurrentcontainer_2 to have shape (None, 6, 512) but got array with shape (32, 6, 20001)

What am I doing wrong?
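Reading the exception literally, the last layer of the network emits vectors of size 512 while the targets are one-hot over 20001 tokens, so the mismatch is on the last axis. A small pre-flight check along these lines can surface that before calling fit. This is only a sketch: `shapes_compatible` is a hypothetical helper, and the two tuples are copied from the error message above.

```python
def shapes_compatible(expected, actual):
    """Return True if `actual` matches `expected`.

    A None entry in `expected` means "any size", as in Keras shape tuples.
    """
    if len(expected) != len(actual):
        return False
    return all(e is None or e == a for e, a in zip(expected, actual))


# Shapes taken from the exception text above.
expected_shape = (None, 6, 512)   # what the model's last layer produces
target_shape = (32, 6, 20001)     # one batch of training targets

if not shapes_compatible(expected_shape, target_shape):
    print("last axis differs: %d vs %d" % (expected_shape[-1], target_shape[-1]))
```

In a real run the expected tuple could be read from the Keras model's output_shape attribute and the actual one from Y_train.shape. The check does not explain why the output size came out as 512, it only confirms where the two disagree.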

recurrentshop error

Reading git+git://github.com/datalogai/recurrentshop.git
Download error on git+git://github.com/datalogai/recurrentshop.git: unknown url type: git+git -- Some packages may not be found!
Reading https://pypi.python.org/simple/recurrentshop/
Couldn't find index page for 'recurrentshop' (maybe misspelled?)
Scanning index of all packages (this may take a while)
Reading https://pypi.python.org/simple/

SimpleSeq2seq typo

SimpleSeq2seq has been renamed to SimpleSeq2Seq (capital "S" in "Seq"). The import in this repository needs to be updated.
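Until the import is fixed, a tolerant lookup can accept either spelling. The following is a minimal sketch using only the standard library. `import_first_available` is a hypothetical helper, and the module path `seq2seq.models` in the commented usage line is taken from the tracebacks on this page and assumes the seq2seq package is installed.

```python
import importlib


def import_first_available(module_name, candidate_names):
    """Return the first attribute in `candidate_names` that the module defines."""
    module = importlib.import_module(module_name)
    for name in candidate_names:
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(
        "none of %r found in module %s" % (list(candidate_names), module_name)
    )


# Intended use (requires the seq2seq package, so it is left commented out):
# SimpleSeq2Seq = import_first_available(
#     "seq2seq.models", ["SimpleSeq2Seq", "SimpleSeq2seq"]
# )
```

Listing the new spelling first means the old one is tried only as a fallback.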

ValueError: Cannot create a tensor proto whose content is larger than 2GB.

Hi there,
I am trying to learn this code. After downloading and installing everything, I tried to run your example code but got the following error. Could you please point out my mistake?

INFO:lib.nn_model.model:Initializing NN model with the following params:
INFO:lib.nn_model.model:Input dimension: 256 (token vector size)
INFO:lib.nn_model.model:Hidden dimension: 512
INFO:lib.nn_model.model:Output dimension: 20001 (token dict size)
INFO:lib.nn_model.model:Input seq length: 16
INFO:lib.nn_model.model:Output seq length: 6
INFO:lib.nn_model.model:Batch size: 32
Traceback (most recent call last):
  File "bin/train.py", line 36, in <module>
    learn()
  File "bin/train.py", line 29, in learn
    nn_model = get_nn_model(token_dict_size=len(index_to_token))
  File "/Users/xiao/WorkSpace/Dev/3rd/debug_seq2seq/lib/nn_model/model.py", line 29, in get_nn_model
    depth=1
  File "/Users/xiao/anaconda3/envs/dl/lib/python3.6/site-packages/seq2seq/models.py", line 81, in SimpleSeq2Seq
    output = decoder(x)
  File "/Users/xiao/anaconda3/envs/dl/lib/python3.6/site-packages/recurrentshop-1.0.0-py3.6.egg/recurrentshop/engine.py", line 452, in __call__
  File "/Users/xiao/anaconda3/envs/dl/lib/python3.6/site-packages/recurrentshop-1.0.0-py3.6.egg/recurrentshop/engine.py", line 917, in num_states
  File "/Users/xiao/anaconda3/envs/dl/lib/python3.6/site-packages/recurrentshop-1.0.0-py3.6.egg/recurrentshop/engine.py", line 128, in num_states
  File "/Users/xiao/anaconda3/envs/dl/lib/python3.6/site-packages/recurrentshop-1.0.0-py3.6.egg/recurrentshop/cells.py", line 171, in build_model
  File "/Users/xiao/anaconda3/envs/dl/lib/python3.6/site-packages/keras/engine/base_layer.py", line 431, in __call__
    self.build(unpack_singleton(input_shapes))
  File "/Users/xiao/anaconda3/envs/dl/lib/python3.6/site-packages/keras/layers/core.py", line 861, in build
    constraint=self.kernel_constraint)
  File "/Users/xiao/anaconda3/envs/dl/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/Users/xiao/anaconda3/envs/dl/lib/python3.6/site-packages/keras/engine/base_layer.py", line 252, in add_weight
    constraint=constraint)
  File "/Users/xiao/anaconda3/envs/dl/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 400, in variable
    v = tf.Variable(value, dtype=tf.as_dtype(dtype), name=name)
  File "/Users/xiao/anaconda3/envs/dl/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 235, in __init__
    constraint=constraint)
  File "/Users/xiao/anaconda3/envs/dl/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 355, in _init_from_args
    initial_value, name="initial_value", dtype=dtype)
  File "/Users/xiao/anaconda3/envs/dl/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1014, in convert_to_tensor
    as_ref=False)
  File "/Users/xiao/anaconda3/envs/dl/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1104, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/Users/xiao/anaconda3/envs/dl/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 235, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "/Users/xiao/anaconda3/envs/dl/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 214, in constant
    value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "/Users/xiao/anaconda3/envs/dl/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py", line 496, in make_tensor_proto
    "Cannot create a tensor proto whose content is larger than 2GB.")
ValueError: Cannot create a tensor proto whose content is larger than 2GB.
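A quick back-of-the-envelope check can show whether a given weight shape would cross the 2GB limit named in the error. The sketch below is an assumption-laden illustration, not a diagnosis: `tensor_nbytes` and `exceeds_proto_limit` are hypothetical helpers, 4 bytes per element assumes float32 weights, and the second shape is a guess at what a layer kernel might look like if both of its axes scaled with the token dict size (20001). The traceback itself does not state the actual shape.

```python
from functools import reduce
from operator import mul

TWO_GB = 2 ** 31  # limit named in the error message, in bytes


def tensor_nbytes(shape, bytes_per_element=4):
    """Bytes needed to hold a dense tensor of `shape` (float32 by default)."""
    return reduce(mul, shape, 1) * bytes_per_element


def exceeds_proto_limit(shape, bytes_per_element=4):
    """True if a tensor of `shape` is at or above the 2GB limit."""
    return tensor_nbytes(shape, bytes_per_element) >= TWO_GB


vocab = 20001   # "Output dimension" from the log above
hidden = 512    # "Hidden dimension" from the log above

# A hidden-by-vocab matrix is small (roughly 41 MB).
print(tensor_nbytes((hidden, vocab)), exceeds_proto_limit((hidden, vocab)))

# Hypothetical: a kernel with vocab rows and 4 * vocab columns (roughly 6.4 GB).
print(tensor_nbytes((vocab, 4 * vocab)), exceeds_proto_limit((vocab, 4 * vocab)))
```

If the real offending weight does scale with the square of the dictionary size, shrinking the token dictionary would reduce it quadratically, but that remains to be confirmed against the model code.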

Machine reboots when model starts

When I run your code with python bin\train.py, the machine restarts.
After debugging I found out that it occurs in the SimpleSeq2Seq model creation.
Machine config:

Windows 10 x64
Intel Core i7-4790K 4.0 GHz
GeForce GTX 780 Ti

I tried both TensorFlow 1.2 and Theano 0.9.0 as the Keras backend and got the same problem with each.

Other models start successfully (e.g. cifar10, mnist).

MemoryError when running train.py

G:\Anaconda2\lib\site-packages\gensim\utils.py:840: UserWarning: detected Windows; aliasing chunkize to chunkize_serial
warnings.warn("detected Windows; aliasing chunkize to chunkize_serial")
G:\Anaconda2\lib\site-packages\gensim\utils.py:1015: UserWarning: Pattern library is not installed, lemmatization won't be available.
warnings.warn("Pattern library is not installed, lemmatization won't be available.")
INFO:summa.preprocessing.cleaner:'pattern' package not found; tag filters are not available for English
Using Theano backend.
INFO:lib.dialog_processor:Loading corpus data...
INFO:lib.dialog_processor:H:/EclipseWorkspace/NetFault_Analysis/keras-seq2seq/debug_seq2seq-master/corpora_processed\movie_lines_cleaned_m1.txt and H:/EclipseWorkspace/NetFault_Analysis/keras-seq2seq/debug_seq2seq-master/words_index\w_idx_movie_lines_cleaned_m1.txt exist, loading files from disk
INFO:main:-----
INFO:lib.w2v_model.w2v:Loading model from H:/EclipseWorkspace/NetFault_Analysis/keras-seq2seq/debug_seq2seq-master/w2v_models\movie_lines_cleaned_w5_m1_v128.bin
INFO:gensim.utils:loading Word2Vec object from H:/EclipseWorkspace/NetFault_Analysis/keras-seq2seq/debug_seq2seq-master/w2v_models\movie_lines_cleaned_w5_m1_v128.bin
INFO:gensim.utils:setting ignored attribute syn0norm to None
INFO:gensim.utils:setting ignored attribute cum_table to None
INFO:gensim.utils:loaded H:/EclipseWorkspace/NetFault_Analysis/keras-seq2seq/debug_seq2seq-master/w2v_models\movie_lines_cleaned_w5_m1_v128.bin
INFO:lib.w2v_model.w2v:Model "movie_lines_cleaned_w5_m1_v128.bin" has been loaded.
INFO:main:-----
INFO:lib.nn_model.model:Initializing NN model with the following params:
INFO:lib.nn_model.model:Input dimension: 128 (token vector size)
INFO:lib.nn_model.model:Hidden dimension: 128
INFO:lib.nn_model.model:Output dimension: 20001 (token dict size)
INFO:lib.nn_model.model:Input seq length: 16
INFO:lib.nn_model.model:Output seq length: 6
INFO:lib.nn_model.model:Batch size: 32
G:\Anaconda2\lib\site-packages\keras\engine\topology.py:379: UserWarning: The regularizers property of layers/models is deprecated. Regularization losses are now managed via the losses layer/model property.
warnings.warn('The regularizers property of layers/models '
Traceback (most recent call last):
File "H:\EclipseWorkspace\NetFault_Analysis\keras-seq2seq\debug_seq2seq-master\bin\train.py", line 37, in <module>
learn()
File "H:\EclipseWorkspace\NetFault_Analysis\keras-seq2seq\debug_seq2seq-master\bin\train.py", line 30, in learn
nn_model = get_nn_model(token_dict_size=len(index_to_token))
File "H:\EclipseWorkspace\NetFault_Analysis\keras-seq2seq\debug_seq2seq-master\lib\nn_model\model.py", line 30, in get_nn_model
depth=1
File "build\bdist.win-amd64\egg\seq2seq\models.py", line 73, in SimpleSeq2Seq
File "build\bdist.win-amd64\egg\recurrentshop\engine.py", line 198, in add
File "G:\Anaconda2\lib\site-packages\keras\models.py", line 332, in add
output_tensor = layer(self.outputs[0])
File "G:\Anaconda2\lib\site-packages\keras\engine\topology.py", line 546, in call
self.build(input_shapes[0])
File "build\bdist.win-amd64\egg\recurrentshop\cells.py", line 121, in build
File "build\bdist.win-amd64\egg\recurrentshop\engine.py", line 83, in init
File "G:\Anaconda2\lib\site-packages\keras\initializations.py", line 95, in orthogonal
a = np.random.normal(0.0, 1.0, flat_shape)
File "mtrand.pyx", line 1636, in mtrand.RandomState.normal (numpy\random\mtrand\mtrand.c:20676)
File "mtrand.pyx", line 242, in mtrand.cont2_array_sc (numpy\random\mtrand\mtrand.c:7401)
MemoryError

I'm using Anaconda2 (Python 2, 64-bit Windows, CPU only), and my colleague is using Ubuntu with a GPU (Python 2). Has anybody else run into this problem? How can I solve it? Many thanks.