rnn-nlu's People

Contributors

elias-1, hadoopit


rnn-nlu's Issues

How to improve prediction results?

Hi, I used your code to train a model. On my test data, the intent results look good, but the tagging results are worse than those of other methods.
I changed some flag parameters, such as doubling 'batch_size', 'word_embedding_size', 'max_training_steps', and 'num_layers', but it did not help. Can you give me any other tips? :)

thanks!

Linear is expecting 2D arguments: [[2, None, 128]] with --bidirectional_rnn=True argument

Hi,

When using the --bidirectional_rnn=True argument, I get the following output:

/usr/bin/python2.7 /home/pldelisl/Downloads/rnn-nlu-master/run_multi-task_rnn.py --data_dir=/home/pldelisl/Downloads/rnn-nlu-master/data/ATIS_samples --max_sequence_length=50 --task=joint --bidirectional_rnn=True --train_dir=model_tmp
Applying Parameters:
word_embedding_size: 128
task: joint
data_dir: /home/pldelisl/Downloads/rnn-nlu-master/data/ATIS_samples
in_vocab_size: 10000
dropout_keep_prob: 0.5
train_dir: model_tmp
num_layers: 1
max_gradient_norm: 5.0
batch_size: 16
out_vocab_size: 10000
use_attention: True
max_sequence_length: 50
bidirectional_rnn: True
steps_per_checkpoint: 300
max_train_data_size: 0
max_training_steps: 10000
max_test_data_size: 0
size: 128
Preparing data in /home/pldelisl/Downloads/rnn-nlu-master/data/ATIS_samples
Max sequence length: 50.
Creating 1 layers of 128 units.
Use the attention RNN model
Traceback (most recent call last):
  File "/home/pldelisl/Downloads/rnn-nlu-master/run_multi-task_rnn.py", line 356, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "/home/pldelisl/Downloads/rnn-nlu-master/run_multi-task_rnn.py", line 353, in main
    train()
  File "/home/pldelisl/Downloads/rnn-nlu-master/run_multi-task_rnn.py", line 230, in train
    model, model_test = create_model(sess, len(vocab), len(tag_vocab), len(label_vocab))
  File "/home/pldelisl/Downloads/rnn-nlu-master/run_multi-task_rnn.py", line 183, in create_model
    task=task)
  File "/home/pldelisl/Downloads/rnn-nlu-master/multi_task_model.py", line 89, in __init__
    buckets, softmax_loss_function=softmax_loss_function, use_attention=use_attention)
  File "/home/pldelisl/Downloads/rnn-nlu-master/seq_labeling.py", line 269, in generate_sequence_output
    use_attention=use_attention)
  File "/home/pldelisl/Downloads/rnn-nlu-master/seq_labeling.py", line 121, in attention_RNN
    initial_state = rnn_cell._linear(encoder_state, output_size, True)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn_cell.py", line 892, in _linear
    raise ValueError("Linear is expecting 2D arguments: %s" % str(shapes))
ValueError: Linear is expecting 2D arguments: [[2, None, 128]]

But I don't get it when using --bidirectional_rnn=False.

Is this error expected?

Thank you very much.

License

What license is this made available under? A popular choice is MIT.

About the models.

Is there only one model in the code? The paper proposes two ways of using the attention mechanism, but I only found the one in the encoder-decoder model. @HadoopIt

An error when running the code

First, thanks for your code.
When I train the model, I get an error about iteritems. The terminal says: "AttributeError: 'dict' object has no attribute 'iteritems'"
How can I solve this problem?
Thanks
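
This error usually means the script is running under Python 3: `dict.iteritems()` exists only in Python 2, while Python 3 dicts provide `dict.items()`. A minimal sketch of a loop helper that tolerates both versions follows. The names `iter_pairs` and `tag_counts` are hypothetical and do not come from the repository.

```python
def iter_pairs(mapping):
    """Return an iterator over (key, value) pairs on Python 2 and 3."""
    # Python 2 dicts expose iteritems(); Python 3 dicts only expose items().
    if hasattr(mapping, "iteritems"):
        return mapping.iteritems()
    return iter(mapping.items())

tag_counts = {"O": 3, "B-fromloc.city_name": 1}  # hypothetical example data
pairs = sorted(iter_pairs(tag_counts))
print(pairs)
```

Replacing `.iteritems()` with `.items()` directly in the affected lines should also work on both versions, at the cost of building a list under Python 2.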

load into tensorflow.js

I'm trying to convert and load the model into tensorflow.js

#TODO --output_node_names='model_tmp/checkpoint' \
tensorflowjs_converter \
    --input_format=tf_saved_model \
    --saved_model_tags=serve \
    model_tmp/model.ckpt-500 \
    model_web

https://github.com/OpenASR/rnn-nlu/blob/master/scripts/convert

When I run the script above I get an error:

IOError: SavedModel file does not exist at: model_tmp/model.ckpt-500

I've tried model_tmp and model_tmp/checkpoint, but neither of those works.

Also, I'm not sure what to provide for output_node_names.

https://github.com/tensorflow/tfjs-converter
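
The IOError suggests that the converter was given a checkpoint prefix rather than a SavedModel directory. With `--input_format=tf_saved_model`, the loader looks for a `saved_model.pb` (or `saved_model.pbtxt`) file inside the given directory, whereas `model_tmp/model.ckpt-500` is a prefix for checkpoint files. The sketch below checks which kind of path you have. The helper name `looks_like_saved_model` is hypothetical, and the demonstration runs on throwaway directories, not on a real model.

```python
import os
import tempfile

def looks_like_saved_model(path):
    """Return True if path is a directory holding a SavedModel protobuf."""
    candidates = ("saved_model.pb", "saved_model.pbtxt")
    return os.path.isdir(path) and any(
        os.path.isfile(os.path.join(path, name)) for name in candidates
    )

with tempfile.TemporaryDirectory() as root:
    # A training directory typically holds checkpoint files like these.
    ckpt_dir = os.path.join(root, "model_tmp")
    os.makedirs(ckpt_dir)
    open(os.path.join(ckpt_dir, "checkpoint"), "w").close()
    open(os.path.join(ckpt_dir, "model.ckpt-500.meta"), "w").close()
    ckpt_result = looks_like_saved_model(ckpt_dir)

    # An exported SavedModel directory contains saved_model.pb.
    export_dir = os.path.join(root, "export")
    os.makedirs(export_dir)
    open(os.path.join(export_dir, "saved_model.pb"), "w").close()
    export_result = looks_like_saved_model(export_dir)

print(ckpt_result, export_result)
```

If the check fails for `model_tmp`, the checkpoint would first need to be exported as a SavedModel (or a frozen graph) before conversion.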

Slow spinup time

Hi guys,

I have been running a slightly modified version of the code on my (admittedly slow) 2013 MacBook Air.

Now I am wondering: is it normal for the declaration of the training ops (tf.train.AdamOptimizer, tf.gradients, tf.clip_by_global_norm, tf.train.AdamOptimizer.apply_gradients) to take a combined 11 minutes (or anything in that order of magnitude)? I have also reduced layer_size to 16, with the same effect. This affects the development workflow.

Would be thankful for any hints, because testing other parts of the code with this taking so long is very time-consuming.

Best,
Anthony

EDIT:
Could this be caused by the Mac needing to allocate virtual memory? I only have 4 GB of RAM, and the model consumes more than that in my current setup.

Restoring model from checkpoint

Hi @HadoopIt,

Thank you for publishing the code for the paper.
I am trying to use a stored pre-trained model to generate the intent and slots for a new sentence. However, based on the outputs it generates, it ends up using a new, untrained model.

saver = tf.train.import_meta_graph('/tmp/model.ckpt-1900.meta')
saver.restore(session, '/tmp/model.ckpt-1900')

model_train, model_test = create_model(session, 139, 36, 6)
step_outputs = model_test.joint_step(session, encoder_inputs, tags, tag_weights, labels, sequence_length, bucket_id, True)

Any suggestions on how to use a trained model from a stored file?

need the dataset

global step 29600 step-time 0.09. Training perplexity 1.00
Eval accuracy: 97.21 976/1004
Test accuracy: 96.08 858/893

I could not get good intent scores.
I think my dataset may have some problems. Could you share your dataset?
Thanks.

How to turn on proper bucket support?

Hi, thank you for the code.
How do I properly enable support for buckets of different sizes?
The model currently works properly with only one bucket size.

Could you show how to run prediction?

Sorry for opening this on the issues page, since it is not really an issue.
I was expecting predictions to be produced after the training iterations finished.
Now that I have a trained model, how can I run prediction on a given sample sentence?

How to reproduce the results of the paper?

Hi, thanks for the great code.

I tried running your code on the ATIS data in https://github.com/yvchen/JointSLU/tree/master/data, and got accuracy 96.75 and F1 94.42 after training for 8400 steps. (I replaced the digits in the text with digit*n, where n is the length of the digit sequence)
However, there is still a gap between this result and the published results.

My questions are:
Is this result reasonable for the published code?
What else should I do to reproduce the published results, except for implementing the tag dependency? Are there any important tricks?
Should I use the default hyper-parameters in the code, or another set of hyper-parameters?

Thanks a lot.
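
The digit preprocessing described above can be sketched as follows. This assumes that "digit*n" means a placeholder token repeated once per digit (for example, `1039` becomes `DIGITDIGITDIGITDIGIT`). The token string `DIGIT` and the function name `mask_digits` are assumptions for illustration, not taken from the repository.

```python
import re

def mask_digits(text, token="DIGIT"):
    """Replace each run of n digits with the token repeated n times."""
    return re.sub(r"\d+", lambda m: token * len(m.group()), text)

masked = mask_digits("flight 1039 leaves at 8 am")
print(masked)  # flight DIGITDIGITDIGITDIGIT leaves at DIGIT am
```

Keeping the length of the digit run preserves the difference between, say, a flight number and a single-digit time.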

When I run it in Docker, it throws an error.

Use of uninitialized value in printf at /Users/huangpeisong/Desktop/rnn-nlu/conlleval.pl line 229, line 74.
Use of uninitialized value in printf at /Users/huangpeisong/Desktop/rnn-nlu/conlleval.pl line 229, line 74.

Docker image: tensorflow/tensorflow:1.2.0

Update for recent changes in TensorFlow

A recent change in TensorFlow comes with the following suggestion:

writing: MultiRNNCell([lstm] * 5) will now build a 5-layer LSTM stack where each
layer shares the same parameters. To get 5 layers each with their own
parameters, write: MultiRNNCell([LSTMCell(...) for _ in range(5)]).

Should cell = tf.contrib.rnn.MultiRNNCell([single_cell] * num_layers) in the code be updated to cell = tf.contrib.rnn.MultiRNNCell([single_cell for _ in range(num_layers)])?
Thanks!
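
The quoted note comes down to Python list semantics: `[cell] * n` repeats one reference n times. A comprehension helps only if it constructs a new object on each iteration, so `[single_cell for _ in range(num_layers)]` would still reuse the one `single_cell` object. The sketch below shows the difference with a stand-in class. `FakeCell` and `count_distinct` are hypothetical names, not TensorFlow types.

```python
class FakeCell(object):
    """Stand-in for an RNN cell; each instance would own its parameters."""
    pass

def count_distinct(cells):
    """Count how many different objects the list actually refers to."""
    return len(set(id(c) for c in cells))

num_layers = 3
single_cell = FakeCell()

repeated = [single_cell] * num_layers              # one object, three references
reused = [single_cell for _ in range(num_layers)]  # still the same object
fresh = [FakeCell() for _ in range(num_layers)]    # three distinct objects

print(count_distinct(repeated), count_distinct(reused), count_distinct(fresh))  # 1 1 3
```

Following the pattern in the quoted note, the cell constructor would be called inside the comprehension, once per layer.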

I can't find any code for the [red line] in Figure 3.

Excuse me!
I have read the paper and found that the current predicted tag is fed into the next step, as shown in Figure 3.

But I can't find any code for this operation.

Is the figure wrong?

Looking forward to your reply!
Thank you very much!
