hadoopit / rnn-nlu
A TensorFlow implementation of Recurrent Neural Networks for Sequence Classification and Sequence Labeling
Hi, I used your code to train a model. When predicting on my test data, the intent results look good, but the tagging results seem worse compared to others' results.
I changed some flag parameters, such as doubling 'batch_size', 'word_embedding_size', 'max_training_steps', and 'num_layers', but none of it helped. Can you give me any other tips? :)
Thanks!
Hi,
When using the --bidirectional_rnn=True argument, I get the following output:
/usr/bin/python2.7 /home/pldelisl/Downloads/rnn-nlu-master/run_multi-task_rnn.py --data_dir=/home/pldelisl/Downloads/rnn-nlu-master/data/ATIS_samples --max_sequence_length=50 --task=joint --bidirectional_rnn=True --train_dir=model_tmp
Applying Parameters:
word_embedding_size: 128
task: joint
data_dir: /home/pldelisl/Downloads/rnn-nlu-master/data/ATIS_samples
in_vocab_size: 10000
dropout_keep_prob: 0.5
train_dir: model_tmp
num_layers: 1
max_gradient_norm: 5.0
batch_size: 16
out_vocab_size: 10000
use_attention: True
max_sequence_length: 50
bidirectional_rnn: True
steps_per_checkpoint: 300
max_train_data_size: 0
max_training_steps: 10000
max_test_data_size: 0
size: 128
Preparing data in /home/pldelisl/Downloads/rnn-nlu-master/data/ATIS_samples
Max sequence length: 50.
Creating 1 layers of 128 units.
Use the attention RNN model
Traceback (most recent call last):
File "/home/pldelisl/Downloads/rnn-nlu-master/run_multi-task_rnn.py", line 356, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "/home/pldelisl/Downloads/rnn-nlu-master/run_multi-task_rnn.py", line 353, in main
train()
File "/home/pldelisl/Downloads/rnn-nlu-master/run_multi-task_rnn.py", line 230, in train
model, model_test = create_model(sess, len(vocab), len(tag_vocab), len(label_vocab))
File "/home/pldelisl/Downloads/rnn-nlu-master/run_multi-task_rnn.py", line 183, in create_model
task=task)
File "/home/pldelisl/Downloads/rnn-nlu-master/multi_task_model.py", line 89, in __init__
buckets, softmax_loss_function=softmax_loss_function, use_attention=use_attention)
File "/home/pldelisl/Downloads/rnn-nlu-master/seq_labeling.py", line 269, in generate_sequence_output
use_attention=use_attention)
File "/home/pldelisl/Downloads/rnn-nlu-master/seq_labeling.py", line 121, in attention_RNN
initial_state = rnn_cell._linear(encoder_state, output_size, True)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn_cell.py", line 892, in _linear
raise ValueError("Linear is expecting 2D arguments: %s" % str(shapes))
ValueError: Linear is expecting 2D arguments: [[2, None, 128]]
But I don't get it when using --bidirectional_rnn=False.
Is this error expected?
Thank you very much.
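For anyone hitting the same ValueError: the message suggests _linear is handed a stacked [2, None, 128] state tensor rather than a 2D one. A possible workaround sketch (an assumption about the cause, not the maintainer's fix) is to flatten the forward and backward states before the linear layer:

import tensorflow as tf

def flatten_bidirectional_state(encoder_state):
    # encoder_state: [2, batch_size, cell_size], the stacked fw/bw final states
    state_fw, state_bw = tf.unstack(encoder_state, num=2, axis=0)
    # [batch_size, 2 * cell_size], a 2D tensor that _linear accepts
    return tf.concat([state_fw, state_bw], axis=1)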
What license is this made available under? A popular choice is MIT.
Hi, I got this error at https://github.com/HadoopIt/rnn-nlu/blob/master/seq_classification.py#L71
When I configure num_layers=2 for the MultiRNNCell, this error occurs.
When I use num_layers=1, or num_layers=2 with state_is_tuple=False, the code runs fine.
I think encoder_state needs some modification, but I'm not very familiar with RNNs in TensorFlow.
Would you mind fixing this bug?
Thanks a lot.
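In case it helps triage: with state_is_tuple=True, MultiRNNCell returns a tuple of LSTMStateTuples rather than a single tensor. A minimal sketch of one way to flatten it before the linear layer (an assumption about the fix, not tested against this repo):

import tensorflow as tf

def flatten_multilayer_state(encoder_state):
    # encoder_state: a tuple of LSTMStateTuple(c, h), one per layer,
    # as returned by MultiRNNCell with state_is_tuple=True
    parts = []
    for layer_state in encoder_state:
        parts.extend([layer_state.c, layer_state.h])
    # [batch_size, 2 * num_layers * cell_size]
    return tf.concat(parts, axis=1)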
Is there only one model in the code? The paper proposed two uses of the attention mechanism, but I only found the one used in the encoder-decoder model. @HadoopIt
First, thanks for your code.
When I train the model, I get an error about iteritems; the terminal says: "AttributeError: 'dict' object has no attribute 'iteritems'".
How can I solve this problem?
Thanks.
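This error usually means Python-2 code is being run under Python 3, where dict.iteritems() was removed. A minimal sketch of the fix (the vocab dict here is purely illustrative):

vocab = {"flight": 0, "city": 1}  # hypothetical dict, for illustration only
# Python 2: for word, idx in vocab.iteritems(): ...
# Python 3: use items() instead, which also works under Python 2
for word, idx in vocab.items():
    print(word, idx)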
I'm trying to convert and load the model into tensorflow.js
#TODO --output_node_names='model_tmp/checkpoint' \
tensorflowjs_converter \
--input_format=tf_saved_model \
--saved_model_tags=serve \
model_tmp/model.ckpt-500 \
model_web
https://github.com/OpenASR/rnn-nlu/blob/master/scripts/convert
When I run the script above, I get an error:
IOError: SavedModel file does not exist at: model_tmp/model.ckpt-500
I've tried model_tmp and model_tmp/checkpoint, but neither works either.
Also, I'm not sure what to provide for output_node_names.
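One possible cause (an assumption, not confirmed): --input_format=tf_saved_model expects a SavedModel directory, while model.ckpt-500 is a raw checkpoint. A sketch of first exporting a SavedModel from the checkpoint in TF 1.x; the tensor names here are hypothetical and must be looked up in the actual graph:

import tensorflow as tf

with tf.Session() as sess:
    saver = tf.train.import_meta_graph('model_tmp/model.ckpt-500.meta')
    saver.restore(sess, 'model_tmp/model.ckpt-500')
    # Hypothetical tensor names; inspect the graph to find the real ones,
    # e.g. [n.name for n in sess.graph.as_graph_def().node]
    tf.saved_model.simple_save(
        sess, 'model_saved_model',
        inputs={'encoder_input': sess.graph.get_tensor_by_name('encoder_input:0')},
        outputs={'intent_output': sess.graph.get_tensor_by_name('intent_output:0')})

The exported model_saved_model directory could then be passed to tensorflowjs_converter in place of the checkpoint path.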
Hi guys,
I've been running a slightly modified version of the code on my (admittedly slow) MacBook Air 2013.
Now I am wondering: is it normal for the declaration of the training ops (tf.train.AdamOptimizer, tf.gradients, tf.clip_by_global_norm, tf.train.AdamOptimizer.apply_gradients) to take a combined 11 minutes (or anything in that order of magnitude)? I've also downsized the layer size to 16, with the same effect. This affects the development workflow.
I would be thankful for any hints, because testing other parts of the code is very time-consuming while this step takes so long.
Best,
Anthony
EDIT:
Could this be caused by the Mac needing to allocate virtual memory? I only have 4 GB of RAM, and the model consumes more than that in my current setup.
Hi @HadoopIt ,
Thank you for publishing the code for the paper.
I am trying to use a stored pre-trained model to generate the intent and slots for a new sentence. However, based on the outputs it generates, it ends up using a new, untrained model.
saver = tf.train.import_meta_graph('/tmp/model.ckpt-1900.meta')
saver.restore(session, '/tmp/model.ckpt-1900')
model_train, model_test = create_model(session, 139, 36, 6)
step_outputs = model_test.joint_step(session, encoder_inputs, tags, tag_weights, labels,sequence_length, bucket_id, True)
Any suggestions on how to use a trained model from a stored file?
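One common cause (a guess, not a confirmed diagnosis): create_model builds and initializes fresh variables, which overwrites the weights restored beforehand. A sketch with the order reversed, so restoring happens after the graph exists:

# Build the graph (and its variables) first...
model_train, model_test = create_model(session, 139, 36, 6)
# ...then restore, so the trained weights overwrite the fresh initialization
saver = tf.train.Saver()
saver.restore(session, '/tmp/model.ckpt-1900')
step_outputs = model_test.joint_step(session, encoder_inputs, tags, tag_weights,
                                     labels, sequence_length, bucket_id, True)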
global step 29600 step-time 0.09. Training perplexity 1.00
Eval accuracy: 97.21 976/1004
Test accuracy: 96.08 858/893
I did not get good scores on the intent task, though.
I think my dataset may have some problems; could you share your dataset?
Thanks.
email: [email protected]
Hi, thank you for the code.
How do I properly turn on support for buckets of different sizes?
Right now the model only works properly with a single bucket size.
Sorry for opening this on the issues page; however, this is not an issue.
I was expecting a prediction step after the training iterations finish.
Since I have a trained model, I was wondering how I could run prediction on a given sample sentence.
Hi, thanks for the great code.
I tried running your code on the ATIS data in https://github.com/yvchen/JointSLU/tree/master/data, and got accuracy 96.75 and F1 94.42 after training for 8400 steps. (I replaced the digits in the text with digit*n, where n is the length of the digit sequence)
However, there is still a gap between this result and the published results.
My questions are:
Is this result reasonable for the published code?
What else should I do to reproduce the published results, except for implementing the tag dependency? Are there any important tricks?
Should I use the default hyper-parameters in the code, or another set of hyper-parameters?
Thanks a lot.
Use of uninitialized value in printf at /Users/huangpeisong/Desktop/rnn-nlu/conlleval.pl line 229, line 74.
Use of uninitialized value in printf at /Users/huangpeisong/Desktop/rnn-nlu/conlleval.pl line 229, line 74.
Docker image: tensorflow/tensorflow:1.2.0
Referring to the code: the logit sets its shape from itself. The code should perhaps be written as:
logit.set_shape(zero_logit.get_shape())
Hope this helps.
Hi, I think dynamic_rnn is more convenient to use; why use the static rnn?
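For comparison, a minimal dynamic_rnn sketch (illustrative only, not code from this repo): it takes a single [batch, time, depth] tensor plus per-example lengths, instead of the per-step tensor lists the static rnn() API needs:

import tensorflow as tf

cell = tf.contrib.rnn.BasicLSTMCell(128)
inputs = tf.placeholder(tf.float32, [None, None, 128])  # [batch, time, depth]
lengths = tf.placeholder(tf.int32, [None])              # true length of each sequence
outputs, state = tf.nn.dynamic_rnn(cell, inputs,
                                   sequence_length=lengths,
                                   dtype=tf.float32)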
@HadoopIt
Due to this change in TensorFlow, as the release notes say:
"Writing MultiRNNCell([lstm] * 5) will now build a 5-layer LSTM stack where each layer shares the same parameters. To get 5 layers, each with their own parameters, write: MultiRNNCell([LSTMCell(...) for _ in range(5)])."
Should cell = tf.contrib.rnn.MultiRNNCell([single_cell] * num_layers) in that line be updated to cell = tf.contrib.rnn.MultiRNNCell([single_cell for _ in range(num_layers)])?
Thanks!
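A hedged note (an assumption, not the maintainer's answer): a comprehension over the same pre-built single_cell object, as in [single_cell for _ in range(num_layers)], still reuses one cell instance. What the release notes suggest is constructing a new cell inside the comprehension, for example:

import tensorflow as tf

num_layers = 2  # example value

def make_cell():
    # a fresh cell object, hence fresh parameters, per layer
    return tf.contrib.rnn.BasicLSTMCell(128)

cell = tf.contrib.rnn.MultiRNNCell([make_cell() for _ in range(num_layers)])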