
nlc's People

Contributors

avati, zxie

nlc's Issues

data url invalid

Hi, the data URLs _NLC_TRAIN_URL and _NLC_DEV_URL are invalid, and I'm wondering where I can download the data. Can anyone help? Thanks.

Which TF version to use?

I tried to launch this project via TF 1.8, but found that the rnn_cell._linear attribute is unavailable. So I googled until I found the advice to replace this method with a tf.contrib.layers.fully_connected layer. Then the next problem arose: how to map the arguments, because fully_connected takes 14 parameters and the code passes 4 positional arguments without keywords.
I think I succeeded, however I was not able to map the last argument of phi_hs2d = tanh(rnn_cell._linear(hs2d, num_units, True, 1.0)) to any argument of fully_connected. I suppose that True in the line above means trainable=True.
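For what it's worth, here is a minimal compatibility sketch, assuming the old signature _linear(args, output_size, bias, bias_start=0.0, scope=None): the third positional argument appears to be the bias flag and 1.0 the bias initializer value (bias_start), rather than trainable=True. tf.layers.dense stands in for fully_connected here.

import tensorflow as tf

def linear_compat(args, output_size, bias=True, bias_start=0.0, scope=None):
    # Approximate replacement for the removed rnn_cell._linear (TF 1.x).
    if not isinstance(args, (list, tuple)):
        args = [args]
    x = tf.concat(args, axis=-1) if len(args) > 1 else args[0]
    return tf.layers.dense(
        x, output_size,
        use_bias=bias,
        bias_initializer=tf.constant_initializer(bias_start),
        name=scope)

# e.g. phi_hs2d = tf.tanh(linear_compat(hs2d, num_units, True, 1.0))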

The next problem is a shape mismatch: ValueError: Shapes must be equal rank, but are 2 and 3 for 'NLC/Decoder/DecoderAttnCell/DecoderAttnCell/while/Select' (op: 'Select') with input shapes: [?], [?,400], [2,?,400], at nlc_model.py:166. So probably the model either can't be ported straightforwardly to the newest version of TF, or it has a size-mismatch error in it.

Could you help, e.g. provide a complete environment in which the model can be trained?

Error at line ret_vars = tf.while_loop(cond=beam_cond, body=beam_step, loop_vars=loop_vars, back_prop=False) in function setup_beam

When I build the model, I get an error at the line ret_vars = tf.while_loop(cond=beam_cond, body=beam_step, loop_vars=loop_vars, back_prop=False) in setup_beam. The error is as follows:
ValueError: The shape for NLC/while/Merge_1:0 is not an invariant for the loop. It enters the loop with shape (1,), but has shape (?,) after one iteration. Provide shape invariants using either the shape_invariants argument of tf.while_loop or set_shape() on the loop variables.
Can you fix it? @avati
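In case it helps, a hedged sketch of the workaround the error message itself suggests: passing shape_invariants to tf.while_loop so that dimensions that change between iterations (here, the growing beam) are declared dynamic. The indices below are illustrative and would need to mirror the actual structure of loop_vars in setup_beam.

# Assuming loop_vars is a flat list of tensors; mark the variable behind
# Merge_1 (illustratively index 1) as having a dynamic leading dimension.
shape_invariants = [v.get_shape() for v in loop_vars]
shape_invariants[1] = tf.TensorShape([None])
ret_vars = tf.while_loop(cond=beam_cond, body=beam_step,
                         loop_vars=loop_vars,
                         shape_invariants=shape_invariants,
                         back_prop=False)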

Minor detail on decoder's init_state

So I thought the decoder would initialize its initial states by taking in the last time steps of the encoder...

In the code, however, this is not happening, since the decoder initializes its initial states with 0s during training... effectively the only TensorFlow link between the encoder and decoder is the attention map. Is this an implementation decision? Or is this standard and more general?
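For comparison, a minimal sketch of the alternative described above, seeding the decoder with the encoder's final state instead of zeros (TF 1.x; the cells, sizes, and placeholder names are illustrative, not the repo's actual variables):

import tensorflow as tf

encoder_cell = tf.nn.rnn_cell.GRUCell(400)
decoder_cell = tf.nn.rnn_cell.GRUCell(400)
encoder_inputs = tf.placeholder(tf.float32, [None, None, 400])  # [batch, time, dim]
decoder_inputs = tf.placeholder(tf.float32, [None, None, 400])

_, encoder_final_state = tf.nn.dynamic_rnn(
    encoder_cell, encoder_inputs, dtype=tf.float32, scope="Encoder")
decoder_outputs, _ = tf.nn.dynamic_rnn(
    decoder_cell, decoder_inputs,
    initial_state=encoder_final_state,  # instead of decoder_cell.zero_state(...)
    scope="Decoder")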

beam search using DynamicAttentionWrapper

Hi, I'm trying to use this beam search code for an attentional seq2seq model. The type of the decoding cell is tensorflow.contrib.seq2seq.python.ops.dynamic_attention_wrapper.DynamicAttentionWrapper,
and the initial state is: DynamicAttentionWrapperState(cell_state=LSTMStateTuple(c=<tf.Tensor 'Test/seq2seq_att/social_bot/tower_0/Decoder/decoder/decoder_1/while/Identity_5:0' shape=(?, 2048) dtype=float32>, h=<tf.Tensor 'Test/seq2seq_att/social_bot/tower_0/Decoder/decoder/decoder_1/while/Identity_6:0' shape=(?, 2048) dtype=float32>), attention=<tf.Tensor 'Test/seq2seq_att/social_bot/tower_0/Decoder/decoder/decoder_1/while/Identity_7:0' shape=(?, 2048) dtype=float32>)

This code doesn't seem to work. How should the code be changed for the new API? Thanks.
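Not a fix for this repo's setup_beam, but a hedged sketch of how beam search is usually wired up with the newer contrib.seq2seq API (DynamicAttentionWrapper became AttentionWrapper around TF 1.2); embedding, sos_id, eos_id, beam_width, etc. are placeholder names:

from tensorflow.contrib import seq2seq

# Tile encoder outputs/state across the beam and let BeamSearchDecoder manage
# the hypotheses instead of a hand-rolled tf.while_loop.
tiled_memory = seq2seq.tile_batch(encoder_outputs, multiplier=beam_width)
tiled_state = seq2seq.tile_batch(encoder_final_state, multiplier=beam_width)
attention = seq2seq.LuongAttention(num_units, tiled_memory)
cell = seq2seq.AttentionWrapper(decoder_cell, attention,
                                attention_layer_size=num_units)
initial_state = cell.zero_state(batch_size * beam_width, tf.float32).clone(
    cell_state=tiled_state)
decoder = seq2seq.BeamSearchDecoder(
    cell=cell, embedding=embedding,
    start_tokens=tf.fill([batch_size], sos_id), end_token=eos_id,
    initial_state=initial_state, beam_width=beam_width)
outputs, _, _ = seq2seq.dynamic_decode(decoder, maximum_iterations=max_len)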

no method to create vocab.dat

@zxie The file data.md mentions learn_bpe.py for creating a vocab.dat file, but there is currently no way to create a vocabulary, as that Python file doesn't exist, which means train.py won't run. How are we supposed to create a vocab.dat file currently?
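As a stopgap, a hedged sketch of building a frequency-sorted vocabulary file by hand (the paths and whitespace tokenization are assumptions, and this does not reproduce whatever BPE merges learn_bpe.py was meant to generate):

from collections import Counter

counts = Counter()
with open("data/train.y.txt") as f:  # hypothetical path to training text
    for line in f:
        counts.update(line.split())

with open("data/vocab.dat", "w") as out:
    for token, _ in counts.most_common():
        out.write(token + "\n")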

how to use lang-8 and CoNLL data?

Hi, I've got the lang-8 and CoNLL datasets. Any clue about how to feed the data into your training scripts?

I would really appreciate it if you could show what the data structure is in "nlc-train.tar" and "nlc-valid.tar".

Thanks!
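Not an official answer, but judging from the training log in the exploding-gradients issue below, training appears to read parallel files train.x.txt (noisy source) and train.y.txt (corrected target), one sentence per line, under the data directory. A hedged sketch of writing lang-8 style sentence pairs into that layout (read_pairs is a hypothetical parser for whatever format your corpus is in):

def write_parallel(pairs, x_path="data/lang-8/train.x.txt",
                   y_path="data/lang-8/train.y.txt"):
    # pairs: iterable of (source_sentence, corrected_sentence) strings
    with open(x_path, "w") as fx, open(y_path, "w") as fy:
        for src, tgt in pairs:
            fx.write(src.strip() + "\n")
            fy.write(tgt.strip() + "\n")

# write_parallel(read_pairs("lang-8-corpus"))  # read_pairs is hypothetical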

Meet "disturbing performance" when decode sentence

Hello,

I got the lang-8 data and ran your code for 40 epochs, and when I decode sentences I get "disturbing performance".

When the input sentence is very short, it works well:
[screenshot omitted]

However, when I input long sentences from the CoNLL-2014 test data, the output is "disturbing":
[screenshot omitted]

And I notice you haven't integrated an LM into your model.

Best,
Jun
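For what it's worth, a hedged sketch of the kind of LM rescoring alluded to above: interpolating a language-model score into each beam hypothesis before picking the output (lm_logprob and the weight alpha are hypothetical names, not part of this repo):

def rescore_with_lm(candidates, alpha=0.3):
    # candidates: list of (seq2seq_logprob, token_sequence) pairs from beam search
    # lm_logprob: hypothetical function returning a language-model log-probability
    return sorted(candidates,
                  key=lambda c: c[0] + alpha * lm_logprob(c[1]),
                  reverse=True)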

Inputs to forward/backward RNN are the same

According to the current implementation of the bidirectional RNN, it seems like the inputs to the forward and backward RNNs are the same.
Shouldn't the input to the backward RNN be reversed?
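A hedged sketch of the reversal being asked about: tf.nn.bidirectional_dynamic_rnn performs this flip internally, but if the backward RNN is run by hand, its input can be reversed along the time axis using the true sequence lengths so padding stays in place (the time-major layout here is an assumption):

# inputs: [time, batch, dim]; lengths: [batch] true sequence lengths (assumed)
# (the keyword names are seq_dim/batch_dim in older TF releases)
reversed_inputs = tf.reverse_sequence(inputs, seq_lengths=lengths,
                                      seq_axis=0, batch_axis=1)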

Can't get the data

Hello,

I ran the code and got this: IOError: [Errno socket error] [Errno 113] No route to host.

Could you put the data on Google Drive or somewhere else? Thanks!

@zxie

Exploding gradients for long running task

Hi, great paper and project.

I was using code from three days ago and I ran the model with default configurations on a GPU machine (TensorFlow 0.8, CUDA 7.5).

Looks like I'm getting exploding gradients after running it for a long time. The first few epochs are OK, but it seems to go haywire after that.

Has anyone else experienced this? Thanks.
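In case it helps anyone hitting the same thing, a hedged sketch of global-norm gradient clipping, the usual mitigation for the NaN grad norms in the log below (the threshold 5.0 is an arbitrary example, and loss/optimizer are placeholder names, not necessarily the repo's defaults):

params = tf.trainable_variables()
grads = tf.gradients(loss, params)
clipped_grads, _ = tf.clip_by_global_norm(grads, clip_norm=5.0)
train_op = optimizer.apply_gradients(zip(clipped_grads, params))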

Below after 16 epochs:

epoch 16, iter 400, cost 3.016297, exp_cost 3.040436, grad norm nan, param norm nan, batch time 2.246108, length mean/std 94.953125/10.900731
epoch 16, iter 500, cost 3.026653, exp_cost 3.040154, grad norm nan, param norm nan, batch time 2.103941, length mean/std 94.718750/8.553086
epoch 16, iter 600, cost 3.021248, exp_cost 3.040449, grad norm nan, param norm nan, batch time 2.602025, length mean/std 107.820312/11.461507
epoch 16, iter 700, cost 3.016261, exp_cost 3.042141, grad norm nan, param norm nan, batch time 1.800595, length mean/std 77.710938/7.695527
epoch 16, iter 800, cost 3.058729, exp_cost 3.041664, grad norm nan, param norm nan, batch time 1.457287, length mean/std 59.484375/7.780167
epoch 16, iter 900, cost 3.045724, exp_cost 3.040437, grad norm nan, param norm nan, batch time 1.573591, length mean/std 56.812500/8.541946
epoch 16, iter 1000, cost 3.041330, exp_cost 3.041670, grad norm nan, param norm nan, batch time 1.729177, length mean/std 75.968750/7.721659
epoch 16, iter 1100, cost 3.051682, exp_cost 3.042871, grad norm nan, param norm nan, batch time 1.512899, length mean/std 66.820312/6.789737

Below at the start of run:

ubuntu@ip-10-99-225-88:~/nlc$ python train.py --data_dir data/lang-8 --train_dir tmp --print_every 100
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:99] Couldn't open CUDA library libcudnn.so. LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/cuda/lib64:
I tensorflow/stream_executor/cuda/cuda_dnn.cc:1562] Unable to load cuDNN DSO
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
Preparing NLC data in data/lang-8
data/lang-8/char/train.x.txt
data/lang-8/char/train.y.txt
Vocabulary size: 98
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:900] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GRID K520
major: 3 minor: 0 memoryClockRate (GHz) 0.797
pciBusID 0000:00:03.0
Total memory: 4.00GiB
Free memory: 3.95GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:755] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GRID K520, pci bus id: 0000:00:03.0)
Creating 3 layers of 400 units.
Created model with fresh parameters.
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:244] PoolAllocator: After 19611 get requests, put_count=9054 evicted_count=1000 eviction_rate=0.110448 and unsatisfied allocation rate=0.594411
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:256] Raising pool_size_limit_ from 100 to 110
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:244] PoolAllocator: After 4370 get requests, put_count=9377 evicted_count=5000 eviction_rate=0.53322 and unsatisfied allocation rate=0.000915332

...

epoch 1, iter 100, cost 2.011677, exp_cost 3.321748, grad norm 158.576340, param norm 159.629105, batch time 4.148134, length mean/std 145.578125/21.429247
epoch 1, iter 200, cost 1.536272, exp_cost 2.299945, grad norm 44.051079, param norm 164.266922, batch time 1.497179, length mean/std 62.718750/7.484293
epoch 1, iter 300, cost 1.403852, exp_cost 1.776326, grad norm 48.068333, param norm 167.985153, batch time 1.371039, length mean/std 60.421875/7.171569
epoch 1, iter 400, cost 1.307269, exp_cost 1.511883, grad norm 67.238319, param norm 171.190140, batch time 2.659545, length mean/std 113.156250/12.402392
epoch 1, iter 500, cost 1.213914, exp_cost 1.374524, grad norm 16.423653, param norm 177.822784, batch time 0.960466, length mean/std 35.421875/5.773876
epoch 1, iter 600, cost 1.178340, exp_cost 1.284747, grad norm 16.784721, param norm 183.438705, batch time 1.008809, length mean/std 37.742188/6.081794
epoch 1, iter 700, cost 1.132778, exp_cost 1.239199, grad norm 16.578270, param norm 186.132919, batch time 1.086025, length mean/std 44.226562/5.668188
epoch 1, iter 800, cost 1.338967, exp_cost 1.206995, grad norm 15.237223, param norm 189.925400, batch time 0.658998, length mean/std 21.937500/6.582114
epoch 1, iter 900, cost 1.118957, exp_cost 1.177377, grad norm 23.203897, param norm 193.276413, batch time 1.553974, length mean/std 65.296875/7.841555
epoch 1, iter 1000, cost 1.068751, exp_cost 1.159566, grad norm 15.187490, param norm 195.315155, batch time 1.152952, length mean/std 43.359375/6.097815
epoch 1, iter 1100, cost 1.072675, exp_cost 1.147219, grad norm 21.353502, param norm 197.324722, batch time 1.025714, length mean/std 39.304688/5.827680
epoch 1, iter 1200, cost 1.208753, exp_cost 1.129652, grad norm 10.937920, param norm 199.290649, batch time 0.750396, length mean/std 21.367188/6.828103
epoch 1, iter 1300, cost 1.078399, exp_cost 1.117404, grad norm 13.053275, param norm 201.881393, batch time 0.954154, length mean/std 38.406250/6.312732
epoch 1, iter 1400, cost 1.048999, exp_cost 1.108933, grad norm 16.010160, param norm 204.622925, batch time 1.001594, length mean/std 38.859375/6.378350
epoch 1, iter 1500, cost 1.022539, exp_cost 1.102282, grad norm 21.692732, param norm 206.503006, batch time 1.692821, length mean/std 72.132812/7.862461
epoch 1, iter 1600, cost 1.166183, exp_cost 1.092178, grad norm 9.664147, param norm 208.044922, batch time 0.774209, length mean/std 21.156250/7.263390
epoch 1, iter 1700, cost 1.047471, exp_cost 1.086936, grad norm 26.538952, param norm 210.478180, batch time 1.191591, length mean/std 43.960938/7.342516
epoch 1, iter 1800, cost 1.058323, exp_cost 1.076487, grad norm 26.259338, param norm 212.381104, batch time 1.999754, length mean/std 87.062500/8.420353
epoch 1, iter 1900, cost 1.063738, exp_cost 1.067403, grad norm 16.072689, param norm 215.532761, batch time 1.168823, length mean/std 48.640625/6.350116
epoch 1, iter 2000, cost 0.983289, exp_cost 1.057265, grad norm 19.078365, param norm 217.736755, batch time 1.095559, length mean/std 38.804688/5.902937

Is downscale still used?

def downscale(self, inp, mask):
    return inp, mask
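    # NOTE: the unconditional return above makes the Downscale block below dead code.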

    with vs.variable_scope("Downscale"):

Does this mean that, contrary to the original paper, downscaling is no longer performed in the model?
