hadoopit / rnn-nlu
A TensorFlow implementation of Recurrent Neural Networks for Sequence Classification and Sequence Labeling
Hi, I used your code to train a model. When predicting on my test data, the intent results look good, but the tagging results seem worse compared to others' results.
I changed some flag parameters, such as doubling 'batch_size', 'word_embedding_size', 'max_training_steps', and 'num_layers', but none of it helped. Can you give me any other tips? :)
Thanks!
Hi,
When using the --bidirectional_rnn=True argument, I get the following output:
/usr/bin/python2.7 /home/pldelisl/Downloads/rnn-nlu-master/run_multi-task_rnn.py --data_dir=/home/pldelisl/Downloads/rnn-nlu-master/data/ATIS_samples --max_sequence_length=50 --task=joint --bidirectional_rnn=True --train_dir=model_tmp
Applying Parameters:
word_embedding_size: 128
task: joint
data_dir: /home/pldelisl/Downloads/rnn-nlu-master/data/ATIS_samples
in_vocab_size: 10000
dropout_keep_prob: 0.5
train_dir: model_tmp
num_layers: 1
max_gradient_norm: 5.0
batch_size: 16
out_vocab_size: 10000
use_attention: True
max_sequence_length: 50
bidirectional_rnn: True
steps_per_checkpoint: 300
max_train_data_size: 0
max_training_steps: 10000
max_test_data_size: 0
size: 128
Preparing data in /home/pldelisl/Downloads/rnn-nlu-master/data/ATIS_samples
Max sequence length: 50.
Creating 1 layers of 128 units.
Use the attention RNN model
Traceback (most recent call last):
File "/home/pldelisl/Downloads/rnn-nlu-master/run_multi-task_rnn.py", line 356, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "/home/pldelisl/Downloads/rnn-nlu-master/run_multi-task_rnn.py", line 353, in main
train()
File "/home/pldelisl/Downloads/rnn-nlu-master/run_multi-task_rnn.py", line 230, in train
model, model_test = create_model(sess, len(vocab), len(tag_vocab), len(label_vocab))
File "/home/pldelisl/Downloads/rnn-nlu-master/run_multi-task_rnn.py", line 183, in create_model
task=task)
File "/home/pldelisl/Downloads/rnn-nlu-master/multi_task_model.py", line 89, in __init__
buckets, softmax_loss_function=softmax_loss_function, use_attention=use_attention)
File "/home/pldelisl/Downloads/rnn-nlu-master/seq_labeling.py", line 269, in generate_sequence_output
use_attention=use_attention)
File "/home/pldelisl/Downloads/rnn-nlu-master/seq_labeling.py", line 121, in attention_RNN
initial_state = rnn_cell._linear(encoder_state, output_size, True)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn_cell.py", line 892, in _linear
raise ValueError("Linear is expecting 2D arguments: %s" % str(shapes))
ValueError: Linear is expecting 2D arguments: [[2, None, 128]]
But I don't get it when using --bidirectional_rnn=False.
Is this error expected?
Thank you very much.
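For anyone hitting the same ValueError: the message suggests _linear is handed a stacked [2, None, 128] state tensor rather than a 2D one. A possible workaround sketch (an assumption about the cause, not the maintainer's fix) is to flatten the forward and backward states before the linear layer:

import tensorflow as tf

def flatten_bidirectional_state(encoder_state):
    # encoder_state: [2, batch_size, cell_size], the stacked fw/bw final states
    state_fw, state_bw = tf.unstack(encoder_state, num=2, axis=0)
    # [batch_size, 2 * cell_size], a 2D tensor that _linear accepts
    return tf.concat([state_fw, state_bw], axis=1)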
What license is this made available under? A popular choice is MIT.
Hi, I got this error at https://github.com/HadoopIt/rnn-nlu/blob/master/seq_classification.py#L71
When I configure num_layers=2 for the MultiRNNCell, this error occurs.
When I use num_layers=1, or num_layers=2 with state_is_tuple=False, the code runs fine.
I think encoder_state needs some modification, but I'm not very familiar with RNNs in TensorFlow.
Would you mind fixing this bug?
Thanks a lot.
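In case it helps triage: with state_is_tuple=True, MultiRNNCell returns a tuple of LSTMStateTuples rather than a single tensor. A minimal sketch of one way to flatten it before the linear layer (an assumption about the fix, not tested against this repo):

import tensorflow as tf

def flatten_multilayer_state(encoder_state):
    # encoder_state: a tuple of LSTMStateTuple(c, h), one per layer,
    # as returned by MultiRNNCell with state_is_tuple=True
    parts = []
    for layer_state in encoder_state:
        parts.extend([layer_state.c, layer_state.h])
    # [batch_size, 2 * num_layers * cell_size]
    return tf.concat(parts, axis=1)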
Is there only one model in the code? The paper proposed two uses of the attention mechanism, but I only found the one used in the encoder-decoder model. @HadoopIt
First, thanks for your code.
When I train the model, I get an error about iteritems; the terminal says: "AttributeError: 'dict' object has no attribute 'iteritems'".
How can I solve this problem?
Thanks.
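This error usually means Python-2 code is being run under Python 3, where dict.iteritems() was removed. A minimal sketch of the fix (the vocab dict here is purely illustrative):

vocab = {"flight": 0, "city": 1}  # hypothetical dict, for illustration only
# Python 2: for word, idx in vocab.iteritems(): ...
# Python 3: use items() instead, which also works under Python 2
for word, idx in vocab.items():
    print(word, idx)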
I'm trying to convert and load the model into tensorflow.js
#TODO --output_node_names='model_tmp/checkpoint' \
tensorflowjs_converter \
--input_format=tf_saved_model \
--saved_model_tags=serve \
model_tmp/model.ckpt-500 \
model_web
https://github.com/OpenASR/rnn-nlu/blob/master/scripts/convert
When I run the script above, I get an error:
IOError: SavedModel file does not exist at: model_tmp/model.ckpt-500
I've tried model_tmp and model_tmp/checkpoint, but neither works either.
Also, I'm not sure what to provide for output_node_names.
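One possible cause (an assumption, not confirmed): --input_format=tf_saved_model expects a SavedModel directory, while model.ckpt-500 is a raw checkpoint. A sketch of first exporting a SavedModel from the checkpoint in TF 1.x; the tensor names here are hypothetical and must be looked up in the actual graph:

import tensorflow as tf

with tf.Session() as sess:
    saver = tf.train.import_meta_graph('model_tmp/model.ckpt-500.meta')
    saver.restore(sess, 'model_tmp/model.ckpt-500')
    # Hypothetical tensor names; inspect the graph to find the real ones,
    # e.g. [n.name for n in sess.graph.as_graph_def().node]
    tf.saved_model.simple_save(
        sess, 'model_saved_model',
        inputs={'encoder_input': sess.graph.get_tensor_by_name('encoder_input:0')},
        outputs={'intent_output': sess.graph.get_tensor_by_name('intent_output:0')})

The exported model_saved_model directory could then be passed to tensorflowjs_converter in place of the checkpoint path.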
Hi guys,
I've been running a slightly modified version of the code on my (admittedly slow) MacBook Air 2013.
Now I am wondering: is it normal for the declaration of the training ops (tf.train.AdamOptimizer, tf.gradients, tf.clip_by_global_norm, tf.train.AdamOptimizer.apply_gradients) to take a combined 11 minutes (or anything in that order of magnitude)? I've also downsized the layer size to 16, with the same effect. This affects the development workflow.
I would be thankful for any hints, because testing other parts of the code is very time-consuming while this step takes so long.
Best,
Anthony
EDIT:
Could this be caused by the Mac needing to allocate virtual memory? I only have 4 GB of RAM, and the model consumes more than that in my current setup.
Hi @HadoopIt ,
Thank you for publishing the code for the paper.
I am trying to use a stored pre-trained model to generate the intent and slots for a new sentence. However, based on the outputs it generates, it ends up using a new, untrained model.
saver = tf.train.import_meta_graph('/tmp/model.ckpt-1900.meta')
saver.restore(session, '/tmp/model.ckpt-1900')
model_train, model_test = create_model(session, 139, 36, 6)
step_outputs = model_test.joint_step(session, encoder_inputs, tags, tag_weights, labels,sequence_length, bucket_id, True)
Any suggestions on how to use a trained model from a stored file?
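One common cause (a guess, not a confirmed diagnosis): create_model builds and initializes fresh variables, which overwrites the weights restored beforehand. A sketch with the order reversed, so restoring happens after the graph exists:

# Build the graph (and its variables) first...
model_train, model_test = create_model(session, 139, 36, 6)
# ...then restore, so the trained weights overwrite the fresh initialization
saver = tf.train.Saver()
saver.restore(session, '/tmp/model.ckpt-1900')
step_outputs = model_test.joint_step(session, encoder_inputs, tags, tag_weights,
                                     labels, sequence_length, bucket_id, True)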
global step 29600 step-time 0.09. Training perplexity 1.00
Eval accuracy: 97.21 976/1004
Test accuracy: 96.08 858/893
I did not get good scores on the intent task, though.
I think my dataset may have some problems; could you share your dataset?
Thanks.
email: [email protected]
Hi, thank you for the code.
How do I properly turn on support for buckets of different sizes?
Right now the model only works properly with a single bucket size.
Sorry for opening this on the issues page; however, this is not an issue.
I was expecting a prediction step after the training iterations finish.
Since I have a trained model, I was wondering how I could run prediction on a given sample sentence.
Hi, thanks for the great code.
I tried running your code on the ATIS data in https://github.com/yvchen/JointSLU/tree/master/data, and got accuracy 96.75 and F1 94.42 after training for 8400 steps. (I replaced the digits in the text with digit*n, where n is the length of the digit sequence)
However, there is still a gap between this result and the published results.
My questions are:
Is this result reasonable for the published code?
What else should I do to reproduce the published results, except for implementing the tag dependency? Are there any important tricks?
Should I use the default hyper-parameters in the code, or another set of hyper-parameters?
Thanks a lot.
Use of uninitialized value in printf at /Users/huangpeisong/Desktop/rnn-nlu/conlleval.pl line 229, line 74.
Use of uninitialized value in printf at /Users/huangpeisong/Desktop/rnn-nlu/conlleval.pl line 229, line 74.
Docker image: tensorflow/tensorflow:1.2.0
Referring to the code: the logit sets its shape from itself. The code should perhaps be written as:
logit.set_shape(zero_logit.get_shape())
Hope this helps.
Hi, I think dynamic_rnn is more convenient to use; why use the static rnn?
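For comparison, a minimal dynamic_rnn sketch (illustrative only, not code from this repo): it takes a single [batch, time, depth] tensor plus per-example lengths, instead of the per-step tensor lists the static rnn() API needs:

import tensorflow as tf

cell = tf.contrib.rnn.BasicLSTMCell(128)
inputs = tf.placeholder(tf.float32, [None, None, 128])  # [batch, time, depth]
lengths = tf.placeholder(tf.int32, [None])              # true length of each sequence
outputs, state = tf.nn.dynamic_rnn(cell, inputs,
                                   sequence_length=lengths,
                                   dtype=tf.float32)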
@HadoopIt
Due to this change in TensorFlow, as the release notes say:
"Writing MultiRNNCell([lstm] * 5) will now build a 5-layer LSTM stack where each layer shares the same parameters. To get 5 layers, each with their own parameters, write: MultiRNNCell([LSTMCell(...) for _ in range(5)])."
Should cell = tf.contrib.rnn.MultiRNNCell([single_cell] * num_layers) in that line be updated to cell = tf.contrib.rnn.MultiRNNCell([single_cell for _ in range(num_layers)])?
Thanks!
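A hedged note (an assumption, not the maintainer's answer): a comprehension over the same pre-built single_cell object, as in [single_cell for _ in range(num_layers)], still reuses one cell instance. What the release notes suggest is constructing a new cell inside the comprehension, for example:

import tensorflow as tf

num_layers = 2  # example value

def make_cell():
    # a fresh cell object, hence fresh parameters, per layer
    return tf.contrib.rnn.BasicLSTMCell(128)

cell = tf.contrib.rnn.MultiRNNCell([make_cell() for _ in range(num_layers)])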