tf-seq2seq's Introduction

TF-seq2seq

Sequence to sequence (seq2seq) learning using TensorFlow.

The core building blocks are the RNN encoder-decoder architecture and the attention mechanism.

The package is implemented largely with the tf.contrib.seq2seq modules introduced in TensorFlow 1.2 (a minimal composition sketch follows the list):

  • AttentionWrapper
  • Decoder
  • BasicDecoder
  • BeamSearchDecoder
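
A minimal composition sketch (assumed placeholder names and toy sizes, not the repo's exact code) of how these modules fit together for training-time decoding:

import tensorflow as tf

# Toy stand-ins for the real encoder outputs and embedded decoder inputs.
batch_size, src_len, tgt_len, hidden_units = 8, 20, 20, 1024
encoder_outputs = tf.zeros([batch_size, src_len, hidden_units])
source_lengths = tf.fill([batch_size], src_len)
decoder_inputs = tf.zeros([batch_size, tgt_len, hidden_units])
target_lengths = tf.fill([batch_size], tgt_len)

# Attention mechanism computed over the encoder outputs.
attention = tf.contrib.seq2seq.LuongAttention(
    num_units=hidden_units, memory=encoder_outputs,
    memory_sequence_length=source_lengths)

# AttentionWrapper wraps a plain RNN cell with the attention mechanism.
cell = tf.contrib.seq2seq.AttentionWrapper(
    tf.contrib.rnn.LSTMCell(hidden_units), attention,
    attention_layer_size=hidden_units)

# BasicDecoder with a TrainingHelper feeds the gold previous token each step.
helper = tf.contrib.seq2seq.TrainingHelper(decoder_inputs, target_lengths)
decoder = tf.contrib.seq2seq.BasicDecoder(
    cell, helper, initial_state=cell.zero_state(batch_size, tf.float32))
outputs, final_state, _ = tf.contrib.seq2seq.dynamic_decode(decoder)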

The package supports

  • Multi-layer GRU/LSTM
  • Residual connections
  • Dropout
  • Attention and input_feeding
  • Beamsearch decoding
  • Writing n-best lists

Dependencies

  • NumPy >= 1.11.1
  • TensorFlow >= 1.2

History

  • June 5, 2017: Major update
  • June 6, 2017: Supports batch beamsearch decoding
  • June 11, 2017: Separated training / decoding
  • June 22, 2017: Supports TF 1.2 (contrib.rnn -> python.ops.rnn_cell)

Usage Instructions

Data Preparation

To preprocess the raw parallel data sample_data.src and sample_data.trg, simply run

cd data/
./preprocess.sh src trg sample_data ${max_seq_len}

Running the above script performs the preprocessing steps widely used in Machine Translation (MT):

  • Normalizing punctuation
  • Tokenizing
  • Byte-pair encoding (# merges = 30000) (Sennrich et al., 2016)
  • Removing sequence pairs longer than ${max_seq_len} (a rough Python equivalent is sketched after this list)
  • Shuffling
  • Building dictionaries
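
For illustration, a rough Python equivalent of the length-cleaning step (the shell script relies on the bundled Moses/subword-nmt scripts; the file names are the sample ones from above and max_seq_len is an assumed value):

max_seq_len = 50  # illustrative stand-in for ${max_seq_len}

with open('sample_data.src') as fsrc, open('sample_data.trg') as ftrg, \
     open('sample_data.clean.src', 'w') as osrc, open('sample_data.clean.trg', 'w') as otrg:
    for src, trg in zip(fsrc, ftrg):
        # Keep a sentence pair only if both sides are within the length limit.
        if len(src.split()) <= max_seq_len and len(trg.split()) <= max_seq_len:
            osrc.write(src)
            otrg.write(trg)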

Training

To train a seq2seq model,

$ python train.py   --cell_type 'lstm' \
                    --attention_type 'luong' \
                    --hidden_units 1024 \
                    --depth 2 \
                    --embedding_size 500 \
                    --num_encoder_symbols 30000 \
                    --num_decoder_symbols 30000 ...

Decoding

To run the trained model for decoding,

$ python decode.py  --beam_width 5 \
                    --decode_batch_size 30 \
                    --model_path $PATH_TO_A_MODEL_CHECKPOINT (e.g. model/translate.ckpt-100) \
                    --max_decode_step 300 \
                    --write_n_best False \
                    --decode_input $PATH_TO_DECODE_INPUT \
                    --decode_output $PATH_TO_DECODE_OUTPUT

If --beam_width=1, greedy decoding is performed at each time-step.
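
As a hedged sketch (assumed names and toy sizes, not the repo's exact code), the two decoding modes map onto tf.contrib.seq2seq roughly like this:

import tensorflow as tf
from tensorflow.python.layers.core import Dense

# Toy placeholders; in the real model these come from the trained network.
vocab_size, hidden_units, batch_size, beam_width = 30000, 1024, 30, 5
embedding = tf.get_variable('embedding', [vocab_size, hidden_units])
cell = tf.contrib.rnn.LSTMCell(hidden_units)
output_layer = Dense(vocab_size)         # projects cell outputs to vocab logits
start_tokens = tf.fill([batch_size], 1)  # assumed GO-token id
end_token = 2                            # assumed EOS-token id

if beam_width == 1:
    # Greedy decoding: feed the argmax token back in at each time-step.
    helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(
        embedding, start_tokens, end_token)
    decoder = tf.contrib.seq2seq.BasicDecoder(
        cell, helper,
        initial_state=cell.zero_state(batch_size, tf.float32),
        output_layer=output_layer)
else:
    # Beam search: the decoder state is tiled beam_width times per example.
    decoder = tf.contrib.seq2seq.BeamSearchDecoder(
        cell=cell, embedding=embedding,
        start_tokens=start_tokens, end_token=end_token,
        initial_state=cell.zero_state(batch_size * beam_width, tf.float32),
        beam_width=beam_width, output_layer=output_layer)

outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(
    decoder, maximum_iterations=300)  # cf. --max_decode_step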

Arguments

Data params

  • --source_vocabulary : Path to source vocabulary
  • --target_vocabulary : Path to target vocabulary
  • --source_train_data : Path to source training data
  • --target_train_data : Path to target training data
  • --source_valid_data : Path to source validation data
  • --target_valid_data : Path to target validation data

Network params

  • --cell_type : RNN cell to use for encoder and decoder (default: lstm)
  • --attention_type : Attention mechanism (bahdanau, luong), (default: bahdanau)
  • --hidden_units : Number of hidden units in each layer of the model
  • --depth : Number of layers in the encoder and decoder (default: 2)
  • --embedding_size : Embedding dimensions of encoder and decoder inputs (default: 500)
  • --num_encoder_symbols : Source vocabulary size to use (default: 30000)
  • --num_decoder_symbols : Target vocabulary size to use (default: 30000)
  • --use_residual : Use residual connection between layers (default: True)
  • --attn_input_feeding : Use input feeding method in attentional decoder (Luong et al., 2015) (default: True)
  • --use_dropout : Use dropout on RNN cell outputs (default: True)
  • --dropout_rate : Dropout probability for cell outputs (0.0: no dropout) (default: 0.3); the cell construction is sketched after this list
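
As a sketch (an assumption about the implementation, using the standard tf.contrib.rnn wrappers), --use_dropout, --dropout_rate, and --use_residual typically map onto cell construction like this:

import tensorflow as tf

hidden_units, depth = 1024, 2          # matching the example flags above
dropout_rate, use_residual = 0.3, True

def build_single_cell():
    cell = tf.contrib.rnn.LSTMCell(hidden_units)
    # Dropout on cell outputs; keep probability is 1 - dropout_rate.
    cell = tf.contrib.rnn.DropoutWrapper(
        cell, output_keep_prob=1.0 - dropout_rate)
    if use_residual:
        # Residual connection: the cell's input is added to its output.
        cell = tf.contrib.rnn.ResidualWrapper(cell)
    return cell

multi_cell = tf.contrib.rnn.MultiRNNCell(
    [build_single_cell() for _ in range(depth)])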

Training params

  • --learning_rate : Learning rate (default: 0.0002)
  • --max_gradient_norm : Clip gradients to this norm (default: 1.0); see the sketch after this list
  • --batch_size : Batch size
  • --max_epochs : Maximum training epochs
  • --max_load_batches : Maximum number of batches to prefetch at one time.
  • --max_seq_length : Maximum sequence length
  • --display_freq : Display training status every N iterations
  • --save_freq : Save a model checkpoint every N iterations
  • --valid_freq : Evaluate the model every N iterations (requires valid_data)
  • --optimizer : Optimizer for training: (adadelta, adam, rmsprop) (default: adam)
  • --model_dir : Path to save model checkpoints
  • --model_name : File name used for model checkpoints
  • --shuffle_each_epoch : Shuffle training dataset for each epoch (default: True)
  • --sort_by_length : Sort pre-fetched minibatches by their target sequence lengths (default: True)
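
A minimal sketch of the usual clip-then-apply pattern behind --learning_rate, --max_gradient_norm, and --optimizer (the loss here is a toy stand-in; the repo's optimizer setup may differ in detail):

import tensorflow as tf

learning_rate, max_gradient_norm = 0.0002, 1.0  # the defaults above

# Toy stand-ins for the real model's parameters and training loss.
w = tf.get_variable('w', [10])
loss = tf.reduce_sum(tf.square(w))

params = tf.trainable_variables()
gradients = tf.gradients(loss, params)
# Rescale all gradients jointly so their global norm is at most max_gradient_norm.
clipped_gradients, _ = tf.clip_by_global_norm(gradients, max_gradient_norm)
optimizer = tf.train.AdamOptimizer(learning_rate)  # per the default --optimizer
train_op = optimizer.apply_gradients(zip(clipped_gradients, params))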

Decoding params

  • --beam_width : Beam width used in beamsearch (default: 1)
  • --decode_batch_size : Batch size used in decoding
  • --max_decode_step : Maximum time step limit in decoding (default: 500)
  • --write_n_best : Write beamsearch n-best list (n=beam_width) (default: False)
  • --decode_input : Input file path to decode
  • --decode_output : Output file path of decoding output

Runtime params

  • --allow_soft_placement : Allow device soft placement
  • --log_device_placement : Log placement of ops on devices
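
These flags typically feed straight into tf.ConfigProto, roughly as follows:

import tensorflow as tf

config = tf.ConfigProto(
    allow_soft_placement=True,   # fall back to another device if an op has no kernel here
    log_device_placement=False)  # set True to log each op's device assignment
sess = tf.Session(config=config)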

Acknowledgements

The implementation is based on the following projects:

  • nematus: Theano implementation of Neural Machine Translation; the major reference for this project
  • subword-nmt: Subword-unit scripts used to preprocess input data
  • moses: Preprocessing scripts used to preprocess input data
  • tf.seq2seq_legacy: Legacy TensorFlow seq2seq tutorial
  • tf_tutorial_plus: Nice tutorials for the tf.contrib.seq2seq API

For comments and feedback, please email me at [email protected] or open an issue here.


tf-seq2seq's Issues

Does the BeamSearchDecoder work well?

Hi, I am also using r1.2 to implement a beam search decoder, but I didn't get correct results; greedy search works well. Did you get correct results when using the beam search decoder? Thanks.

what are data params?

Preprocessing created a bunch of files. Which of these files are the data params, and which data params are required?

--source_vocabulary : Path to source vocabulary
--target_vocabulary : Path to target vocabulary
--source_train_data : Path to source training data
--target_train_data : Path to target training data
--source_valid_data : Path to source validation data
--target_valid_data : Path to target validation data

Can you show how to train a model using train.py with all the data params required for training?

Problem using attention wrapper.

I am getting an issue related to a mismatch of state and output, but I am unable to figure out the cause.
It would be really appreciated if someone could guide me. Thanks in advance.
I am using tensorflow-gpu==1.2.1, with a 1080 Ti graphics card.

Error is as below:
ValueError: Shapes (8, 522) and (8, 512) are incompatible

Error occurs in the file "attention_wrapper.py" in the method named "call" at line 708

cell_output, next_cell_state = self._cell(cell_inputs, cell_state)

I was able to figure out that it is adding the attention_size to the shape and so there is a mismatch.
But I have no idea how to fix it.
The code is below; the hyper-parameters are declared as follows (for test purposes):

batch_size = 8
number_of_units_per_layer = 512
number_of_layers = 3
attn_size = 10

def build_decoder_cell(enc_output, enc_state, source_sequence_length, attn_size, batch_size):

    encoder_outputs = enc_output
    encoder_last_state = enc_state
    encoder_inputs_length = source_sequence_length

    attention_mechanism = attention_wrapper.LuongAttention(
        num_units=attn_size, memory=encoder_outputs,
        memory_sequence_length=encoder_inputs_length,
        scale=True,
        name='LuongAttention')

    # Building decoder_cell
    decoder_cell_list = [
        build_single_cell() for i in range(number_of_layers)]

    decoder_initial_state = encoder_last_state

    def attn_decoder_input_fn(inputs, attention):
        # if not self.attn_input_feeding:
        #     return inputs

        # Essential when use_residual=True
        _input_layer = Dense(size, dtype=tf.float32,
                             name='attn_input_feeding')
        return _input_layer(array_ops.concat([inputs, attention], -1))

    # AttentionWrapper wraps RNNCell with the attention_mechanism
    # Note: We implement the attention mechanism only on the top decoder layer
    decoder_cell_list[-1] = attention_wrapper.AttentionWrapper(
        cell=decoder_cell_list[-1],
        attention_mechanism=attention_mechanism,
        attention_layer_size=attn_size,
        # cell_input_fn=attn_decoder_input_fn,
        initial_cell_state=encoder_last_state[-1],
        alignment_history=False,
        name='Attention_Wrapper')

    # To be compatible with AttentionWrapper, the encoder last state
    # of the top layer should be converted into the AttentionWrapperState form.
    # We can easily do this by calling AttentionWrapper.zero_state.

    # Also, if beamsearch decoding is used, the batch_size argument in .zero_state
    # should be ${decoder_beam_width} times the original batch_size.
    # batch_size = self.batch_size if not self.use_beamsearch_decode \
    #     else self.batch_size * self.beam_width
    initial_state = [state for state in encoder_last_state]

    initial_state[-1] = decoder_cell_list[-1].zero_state(
        batch_size=batch_size, dtype=tf.float32)
    decoder_initial_state = tuple(initial_state)

    return tf.contrib.rnn.MultiRNNCell(decoder_cell_list), decoder_initial_state

Thank you once again.

attn_input_feeding

In the documentation, it is suggested to set attn_input_feeding=True during decoding.
But in the code, I don't see any place where it is set to True during decoding.

The configuration is all read from the dump created during training, and since it was set to False during training, attn_input_feeding remains False during decoding as well.

Am I missing something?

rnn moves back into core layer, should update for tf1.2?

Hi, this repo is useful; I just haven't found an implementation of seq2seq that really trains and runs successfully while making full use of the newest TensorFlow APIs. Checking the code, it seems the rnn package has moved back into the core layer instead of the contrib part. Has this been tested on TensorFlow 1.2? Just reporting this to help bring the repo up to date.

representation of the end token?

I'm a beginner in deep learning. Can I ask if the following code is trying to append decoder_end_token, a tensor full of 1s, to all the decoder targets?
When we calculate the sequence loss, couldn't the cross-entropy between the end_token and the corresponding y_hat be very large?

decoder_end_token = tf.ones(shape=[self.batch_size, 1], dtype=tf.int32) * data_utils.end_token
self.decoder_targets_train = tf.concat([self.decoder_inputs, decoder_end_token], axis=1)
self.loss = seq2seq.sequence_loss(logits=self.decoder_logits_train,
                                  targets=self.decoder_targets_train,
                                  weights=masks,
                                  average_across_timesteps=True,
                                  average_across_batch=True)

BPE and validation data

Can you clarify a bit how you obtain the validation data, and why BPE is used in such cases?

tf.app.flags.DEFINE_string('source_valid_data', 'data/newstest2012.bpe.de', 'Path to source validation data')
tf.app.flags.DEFINE_string('target_valid_data', 'data/newstest2012.bpe.fr', 'Path to target validation data')

can this code work?

I found this seq2seq project, based on the latest TensorFlow 1.2, quite helpful. May I know whether your source code really works? I found that an error arises in your notebook.

[screenshot: notebook error, 2017-06-03]

Could you provide the sample data?

Thank you very much for your contribution.
I am not familiar with NMT.
Could you provide the address for downloading sample.src and sample.trg?

ResourceExhausted error

While training, I get a resource exhausted error. I decreased the hidden units, the batch size, and num_enc/dec_units, but still no change. It is able to detect the GPU:
2017-12-15 17:50:40.266983: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GTX 980 Ti major: 5 minor: 2 memoryClockRate(GHz): 1.076
pciBusID: 0000:01:00.0
totalMemory: 5.93GiB freeMemory: 5.83GiB
2017-12-15 17:50:40.267067: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 980 Ti, pci bus id: 0000:01:00.0, compute capability: 5.2)
building model..
building encoder..
building decoder and attention..
setting optimizer..
Created new model parameters..
Training..

But after this following is the trackback of error

2017-12-15 17:45:48.103510: W tensorflow/core/common_runtime/bfc_allocator.cc:277] ****************************************************************************************************
2017-12-15 17:45:48.103914: W tensorflow/core/framework/op_kernel.cc:1192] Resource exhausted: OOM when allocating tensor with shape[500,15000]
2017-12-15 17:45:48.104030: W tensorflow/core/framework/op_kernel.cc:1192] Resource exhausted: OOM when allocating tensor with shape[500,15000]
[[Node: decoder/gradients/decoder/decoder/while/Select_1_grad/zeros_like = ZerosLike[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](decoder/gradients/decoder/decoder/while/Select_1_grad/zeros_like/Enter, ^decoder/gradients/Sub)]]
Traceback (most recent call last):
File "train.py", line 227, in
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "train.py", line 223, in main
train()
File "train.py", line 149, in train
decoder_inputs=target, decoder_inputs_length=target_len)
File "/DATA/USERS/sai/residual/tf-seq2seq/seq2seq_model.py", line 470, in train
outputs = sess.run(output_feed, input_feed)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 889, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1120, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1317, in _do_run
options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1336, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[500,15000]
[[Node: decoder/gradients/decoder/decoder/while/Select_1_grad/zeros_like = ZerosLike[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](decoder/gradients/decoder/decoder/while/Select_1_grad/zeros_like/Enter, ^decoder/gradients/Sub)]]
[[Node: decoder/gradients/decoder/decoder/while/BasicDecoderStep/TrainingHelperNextInputs/cond/Merge_grad/cond_grad/_191 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1168_decoder/gradients/decoder/decoder/while/BasicDecoderStep/TrainingHelperNextInputs/cond/Merge_grad/cond_grad", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Caused by op u'decoder/gradients/decoder/decoder/while/Select_1_grad/zeros_like', defined at:
File "train.py", line 227, in
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "train.py", line 223, in main
train()
File "train.py", line 125, in train
model = create_model(sess, FLAGS)
File "train.py", line 74, in create_model
model = Seq2SeqModel(config, 'train')
File "/DATA/USERS/sai/residual/tf-seq2seq/seq2seq_model.py", line 68, in init
self.build_model()
File "/DATA/USERS/sai/residual/tf-seq2seq/seq2seq_model.py", line 77, in build_model
self.build_decoder()
File "/DATA/USERS/sai/residual/tf-seq2seq/seq2seq_model.py", line 235, in build_decoder
self.init_optimizer()
File "/DATA/USERS/sai/residual/tf-seq2seq/seq2seq_model.py", line 412, in init_optimizer
gradients = tf.gradients(self.loss, trainable_params)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gradients_impl.py", line 581, in gradients
grad_scope, op, func_call, lambda: grad_fn(op, *out_grads))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gradients_impl.py", line 353, in _MaybeCompile
return grad_fn() # Exit early
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gradients_impl.py", line 581, in
grad_scope, op, func_call, lambda: grad_fn(op, *out_grads))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/math_grad.py", line 907, in _SelectGrad
zeros = array_ops.zeros_like(x)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/array_ops.py", line 1495, in zeros_like
return gen_array_ops._zeros_like(tensor, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 5960, in _zeros_like
"ZerosLike", x=x, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1470, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

...which was originally created as op u'decoder/decoder/while/Select_1', defined at:
File "train.py", line 227, in
tf.app.run()
[elided 5 identical lines from previous traceback]
File "/DATA/USERS/sai/residual/tf-seq2seq/seq2seq_model.py", line 77, in build_model
self.build_decoder()
File "/DATA/USERS/sai/residual/tf-seq2seq/seq2seq_model.py", line 210, in build_decoder
maximum_iterations=max_decoder_length))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/seq2seq/python/ops/decoder.py", line 286, in dynamic_decode
swap_memory=swap_memory)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2816, in while_loop
result = loop_context.BuildLoop(cond, body, loop_vars, shape_invariants)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2640, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2590, in _BuildLoop
body_result = body(*packed_vars_for_body)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/seq2seq/python/ops/decoder.py", line 253, in body
zero_outputs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/util/nest.py", line 413, in map_structure
structure[0], [func(*x) for x in entries])
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/seq2seq/python/ops/decoder.py", line 251, in
lambda out, zero: array_ops.where(finished, zero, out),
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/array_ops.py", line 2441, in where
return gen_math_ops._select(condition=condition, t=x, e=y, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 3988, in _select
"Select", condition=condition, t=t, e=e, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[500,15000]
[[Node: decoder/gradients/decoder/decoder/while/Select_1_grad/zeros_like = ZerosLike[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](decoder/gradients/decoder/decoder/while/Select_1_grad/zeros_like/Enter, ^decoder/gradients/Sub)]]
[[Node: decoder/gradients/decoder/decoder/while/BasicDecoderStep/TrainingHelperNextInputs/cond/Merge_grad/cond_grad/_191 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1168_decoder/gradients/decoder/decoder/while/BasicDecoderStep/TrainingHelperNextInputs/cond/Merge_grad/cond_grad", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Get error when initializing decoder initial state

In the seq2seq_model.py file,
I use a bi-directional GRU for the encoder, but I got an error.
More specifically, at line 391 I got the following error:

"TypeError: Tensor objects are not iterable when eager execution is not enabled. To iterate over this tensor use tf.map_fn."

I use TensorFlow 1.7. How can I solve this problem?

In addition, why do we have to initialize the last decoder cell to the zero state rather than to the encoder's last state, as with the preceding layers?

Thanks in advance

Does it support different depth for encoder and decoder

Hi,
When I reviewed this code, I found that the depth of the encoder and decoder must be the same,
since "self.depth = config['depth']" is used in constructing both the encoder and the decoder.
When I try to set different depths, I just get the following error:


Traceback (most recent call last):
File "train.py", line 301, in
tf.app.run()
File "/home/aldy/.local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "train.py", line 297, in main
train()
File "train.py", line 145, in train
model = create_model(sess, FLAGS)
File "train.py", line 82, in create_model
model = Seq2SeqModel(config, 'train')
File "/home/aldy/work/nmt/tf-seq/tf-seq2seq-master/seq2seq_model.py", line 72, in init
self.build_model()
File "/home//aldy/work/nmt/tf-seq/tf-seq2seq-master/seq2seq_model.py", line 81, in build_model
self.build_decoder()
File "/home/aldy/work/nmt/tf-seq/tf-seq2seq-master/seq2seq_model.py", line 287, in build_decoder
maximum_iterations=max_decoder_length))
File "/home//aldy/.local/lib/python2.7/site-packages/tensorflow/contrib/seq2seq/python/ops/decoder.py", line 286, in dynamic_decode
swap_memory=swap_memory)
File "/home/aldy/.local/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2816, in while_loop
result = loop_context.BuildLoop(cond, body, loop_vars, shape_invariants)
File "/home/aldy/.local/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2640, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "/home//aldy/.local/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2590, in _BuildLoop
body_result = body(*packed_vars_for_body)
File "/home//aldy/.local/lib/python2.7/site-packages/tensorflow/contrib/seq2seq/python/ops/decoder.py", line 234, in body
decoder_finished) = decoder.step(time, inputs, state)
File "/home/aldy/.local/lib/python2.7/site-packages/tensorflow/contrib/seq2seq/python/ops/basic_decoder.py", line 138, in step
cell_outputs, cell_state = self._cell(inputs, state)
File "/home/aldy/.local/lib/python2.7/site-packages/tensorflow/python/ops/rnn_cell_impl.py", line 183, in call
return super(RNNCell, self).call(inputs, state)
File "/home/aldy/.local/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 575, in call
outputs = self.call(inputs, *args, **kwargs)
File "/home/aldy/.local/lib/python2.7/site-packages/tensorflow/python/ops/rnn_cell_impl.py", line 1066, in call
cur_inp, new_state = cell(cur_inp, cur_state)
File "/home/aldy/.local/lib/python2.7/site-packages/tensorflow/python/ops/rnn_cell_impl.py", line 951, in call
outputs, new_state = self._cell(inputs, state, scope=scope)
File "/home/aldy/.local/lib/python2.7/site-packages/tensorflow/python/ops/rnn_cell_impl.py", line 891, in call
output, new_state = self._cell(inputs, state, scope)
File "/home/aldy/.local/lib/python2.7/site-packages/tensorflow/python/ops/rnn_cell_impl.py", line 183, in call
return super(RNNCell, self).call(inputs, state)
File "/home/aldy/.local/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 575, in call
outputs = self.call(inputs, *args, **kwargs)
File "/home/aldy/.local/lib/python2.7/site-packages/tensorflow/python/ops/rnn_cell_impl.py", line 591, in call
(c_prev, m_prev) = state
ValueError: too many values to unpack


Anyway, do you have any suggestion to solve this problem?
