
snli-attention's People

Contributors

cheng6076, saraswat


snli-attention's Issues

error running LSTMN.lua

I get an error when I try running LSTMN.lua
/root/torch/install/bin/luajit: /root/torch/install/share/lua/5.1/nn/ClassNLLCriterion.lua:44: Assertion `cur_target >= 0 && cur_target < n_classes' failed. at /root/nn/generic/ClassNLLCriterion.c:52
stack traceback:
[C]: in function 'ClassNLLCriterion_updateOutput'
/root/torch/install/share/lua/5.1/nn/ClassNLLCriterion.lua:44: in function 'forward'
LSTMN.lua:203: in function 'opfunc'
/root/torch/install/share/lua/5.1/optim/adam.lua:37: in function 'adam'
LSTMN.lua:267: in main chunk
[C]: in function 'dofile'
/root/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x004064f0

I followed the setup and usage instructions. When I first tried running LSTMN.lua, I got an error about the rnn module not being found. I installed rnn (using luarocks) and ran LSTMN.lua again. It runs for a while and then fails with the error shown above.
Note: I use glove.840B.300d as word2vec.txt
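A likely cause of the assertion above is label indexing: Torch's ClassNLLCriterion requires class targets in the range [1, n_classes], so zero-based SNLI labels (0, 1, 2) trip the `cur_target >= 0 && cur_target < n_classes` check. A minimal sketch of the shift, assuming the labels are loaded as plain zero-based integers (the helper name is hypothetical):

```lua
-- Hypothetical fix sketch: shift zero-based labels (0/1/2) into the
-- one-based range (1/2/3) that Torch's ClassNLLCriterion expects.
local function shift_labels(labels)
  local shifted = {}
  for i, y in ipairs(labels) do
    shifted[i] = y + 1  -- 0/1/2 -> 1/2/3
  end
  return shifted
end

print(table.concat(shift_labels({0, 2, 1}), " "))  -- 1 3 2
```

If the preprocessing already writes one-based labels, the same error can instead come from stray lines in the data files producing out-of-range targets.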

about eval_split function

In line 153:
local max, indice = prediction:max(2) -- indice is a 2d tensor here, we need to flatten it
if indice[1][1] == label[1] then correct_count = correct_count + 1 end
It seems to check only the first result of each batch. Why not check them all?
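A sketch of counting every example in the batch instead of only the first, assuming `prediction:max(2)` yields row-wise argmax indices as in eval_split; plain-Lua arrays stand in for the flattened tensors here:

```lua
-- Sketch: compare each predicted class index against its label and
-- count the matches, rather than testing only indice[1][1].
local function count_correct(indices, labels)
  local correct = 0
  for i = 1, #labels do
    if indices[i] == labels[i] then
      correct = correct + 1
    end
  end
  return correct
end

print(count_correct({1, 3, 2, 2}, {1, 2, 2, 3}))  -- 2
```

In the Torch version this would mean iterating over `indice:size(1)` rows (or using a tensor equality comparison) and accumulating into correct_count.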

Questions about the implementation

Thanks for sharing the research code along with the paper. I am reading the code in parallel with the paper and I think it is very useful for understanding details that are not explicitly mentioned in the paper. I am not sure if this is the right place to ask but I have two questions regarding the implementation.

  1. The classifier backprop call on line 193 of SNLI-attention/LSTMN.lua is taking {rnn_alpha, rnn_h_dec} as input. I don't quite understand why rnn_alpha is part of the input. Shouldn't the input be the hidden state vectors for the source and target sequences, i.e. {rnn_h_enc, rnn_h_dec}?
  2. Why do we add the gradients drnn_c_dec[1] and drnn_h_dec[1] to the gradients drnn_c_enc[max_length+1] and drnn_h_enc[max_length+1] on lines 222-223 of SNLI-attention/LSTMN.lua? After reading the paper and the rest of the implementation, I have the impression that the initial hidden state and memory vectors of the decoder are random vectors and they don't depend on the final hidden state and memory vectors of the encoder.

why I can't get similar results in the paper

I ran this experiment several times following the instructions, but I only get accuracy around 0.800. Should I change something in the code? The code seems to run correctly.

Does the following output report the accuracy?
evaluating loss over split index 3
test_loss = 0.7915

Need to downgrade cutorch as well.

Hello, I am trying to run this code. I am unable to downgrade cunn because I have the latest cutorch, so I need to downgrade cutorch as well, as said here. To avoid a binary search over commits for a build that works, can you tell me the cutorch version (commit id) with which you ran the code?

Mac os couldn't run

Hi, when I run it on Mac OS, it doesn't work.

Processing text into tensors...
/Users/xxxxx/torch/install/bin/luajit: ./util/BatchLoaderC.lua:81: attempt to index local 'f' (a nil value)
stack traceback:
./util/BatchLoaderC.lua:81: in function 'text_to_tensor'
./util/BatchLoaderC.lua:15: in function 'create'
LSTMN.lua:62: in main chunk
[C]: in function 'dofile'
...owei/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:131: in main chunk
[C]: at 0x010580e2d0
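The "attempt to index local 'f' (a nil value)" error usually means an io.open call in text_to_tensor returned nil, typically because the data file path does not exist on that machine. A sketch of a defensive guard (the helper name is an assumption, not from the repository) that would surface the real cause:

```lua
-- Hypothetical guard: io.open returns nil plus an error message on
-- failure, so fail early with the filename instead of crashing later
-- on a nil file handle.
local function open_or_die(path)
  local f, err = io.open(path, "r")
  if not f then
    error(string.format("could not open %s: %s", path, tostring(err)))
  end
  return f
end
```

Checking that the train/dev/test and word2vec paths passed to BatchLoaderC actually exist is the first thing to try here.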

AddScalar

/opt/torch/install/bin/luajit: ./model/encoder_lstmn_w2v.lua:38: attempt to call field 'AddScalar' (a nil value)
stack traceback:
./model/encoder_lstmn_w2v.lua:38: in function 'lstmn'
LSTMN.lua:70: in main chunk
[C]: in function 'dofile'
/opt/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x004064d0

Does 'AddScalar' do the same as 'ReplicateAdd' in the previous version?

build failed at git checkout 3f5a8ba2bd4e6babf112d6369c98f37be86d2391

andy1028@andy1028-Envy:/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn$ luarocks make rocks/*
Warning: unmatched variable LUALIB
cmake -E make_directory build && cd build && cmake .. -DLUALIB= -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH="/home/andy1028/torch/install/bin/.." -DCMAKE_INSTALL_PREFIX="/home/andy1028/torch/install/lib/luarocks/rocks/cunn/scm-1" && make -j$(getconf _NPROCESSORS_ONLN) install

-- Found Torch7 in /home/andy1028/torch/install
-- Compiling for CUDA architecture: 5.0
-- Configuring done
-- Generating done
-- Build files have been written to: /media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/build
[ 2%] [ 7%] [ 7%] [ 12%] [ 12%] [ 19%] [ 19%] [ 19%] Building NVCC (Device) object CMakeFiles/cunn.dir//./cunn_generated_VolumetricMaxPooling.cu.o
Building NVCC (Device) object CMakeFiles/cunn.dir//./cunn_generated_LogSoftMax.cu.o
Building NVCC (Device) object CMakeFiles/cunn.dir//./cunn_generated_SpatialFractionalMaxPooling.cu.o
Building NVCC (Device) object CMakeFiles/cunn.dir//./cunn_generated_Abs.cu.o
Building NVCC (Device) object CMakeFiles/cunn.dir//./cunn_generated_AbsCriterion.cu.o
Building NVCC (Device) object CMakeFiles/cunn.dir//./cunn_generated_ClassNLLCriterion.cu.o
Building NVCC (Device) object CMakeFiles/cunn.dir//./cunn_generated_ELU.cu.o
Building NVCC (Device) object CMakeFiles/cunn.dir//./cunn_generated_DistKLDivCriterion.cu.o
/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/VolumetricMaxPooling.cu(26): error: identifier "THInf" is undefined

/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/VolumetricMaxPooling.cu(77): error: identifier "THInf" is undefined

2 errors detected in the compilation of "/tmp/tmpxft_00004c64_00000000-7_VolumetricMaxPooling.cpp1.ii".
CMake Error at cunn_generated_VolumetricMaxPooling.cu.o.cmake:267 (message):
Error generating file
/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/build/CMakeFiles/cunn.dir//./cunn_generated_VolumetricMaxPooling.cu.o

CMakeFiles/cunn.dir/build.make:5405: recipe for target 'CMakeFiles/cunn.dir/./cunn_generated_VolumetricMaxPooling.cu.o' failed
make[2]: *** [CMakeFiles/cunn.dir/./cunn_generated_VolumetricMaxPooling.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/ELU.cu(24): error: identifier "THCudaTensor_pointwiseApply2" is undefined

/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/ELU.cu(49): error: identifier "THCudaTensor_pointwiseApply3" is undefined

/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/Abs.cu(19): error: identifier "THCudaTensor_pointwiseApply2" is undefined

/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/Abs.cu(39): error: identifier "THCudaTensor_pointwiseApply3" is undefined

/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/SpatialFractionalMaxPooling.cu(44): error: identifier "THInf" is undefined

2 errors detected in the compilation of "/tmp/tmpxft_00004c75_00000000-7_ELU.cpp1.ii".
1 error detected in the compilation of "/tmp/tmpxft_00004c6a_00000000-7_SpatialFractionalMaxPooling.cpp1.ii".
2 errors detected in the compilation of "/tmp/tmpxft_00004c6d_00000000-7_Abs.cpp1.ii".
CMake Error at cunn_generated_ELU.cu.o.cmake:267 (message):
Error generating file
/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/build/CMakeFiles/cunn.dir//./cunn_generated_ELU.cu.o

CMakeFiles/cunn.dir/build.make:2299: recipe for target 'CMakeFiles/cunn.dir/./cunn_generated_ELU.cu.o' failed
make[2]: *** [CMakeFiles/cunn.dir/./cunn_generated_ELU.cu.o] Error 1
CMake Error at cunn_generated_SpatialFractionalMaxPooling.cu.o.cmake:267 (message):
Error generating file
/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/build/CMakeFiles/cunn.dir//./cunn_generated_SpatialFractionalMaxPooling.cu.o

CMakeFiles/cunn.dir/build.make:444: recipe for target 'CMakeFiles/cunn.dir/./cunn_generated_SpatialFractionalMaxPooling.cu.o' failed
make[2]: *** [CMakeFiles/cunn.dir/./cunn_generated_SpatialFractionalMaxPooling.cu.o] Error 1
CMake Error at cunn_generated_Abs.cu.o.cmake:267 (message):
Error generating file
/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/build/CMakeFiles/cunn.dir//./cunn_generated_Abs.cu.o

CMakeFiles/cunn.dir/build.make:644: recipe for target 'CMakeFiles/cunn.dir/./cunn_generated_Abs.cu.o' failed
make[2]: *** [CMakeFiles/cunn.dir/./cunn_generated_Abs.cu.o] Error 1
CMakeFiles/Makefile2:60: recipe for target 'CMakeFiles/cunn.dir/all' failed
make[1]: *** [CMakeFiles/cunn.dir/all] Error 2
Makefile:117: recipe for target 'all' failed
make: *** [all] Error 2

Error: Build error: Failed building.
andy1028@andy1028-Envy:/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn$

previous alignment vector in encoder

Hi, I found that you didn't use the previous intra-alignment vector to compute the intra-attention score in the encoder, as you did in the decoder.
Does it matter?

enc_lookup

/opt/torch/install/bin/luajit: LSTMN.lua:301: attempt to index global 'enc_lookup' (a nil value)
stack traceback:
LSTMN.lua:301: in main chunk
[C]: in function 'dofile'
/opt/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x004064d0

BatchLoaderB.lua:34: bad argument #2 to 'sub'

Hey there,

I am getting an error when running main.lua. My dev/test/train and word2vec txt files seem fine. I can't figure out how to fix the following error:

tput: No value for $TERM and no -T specified
/idiap/user/lmiculicich/Installations/torch/install/bin/luajit: ./util/BatchLoaderB.lua:34: bad argument #2 to 'sub' (out of range at /idiap/user/lmiculicich/Installations/torch/pkg/torch/generic/Tensor.c:304)
stack traceback:
[C]: in function 'sub'
./util/BatchLoaderB.lua:34: in function 'create'
/idiap/temp/jpilault/Idiap_project/SNLI-attention/main.lua:63: in main chunk
[C]: in function 'dofile'
...ions/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00405d50

Attempt to call field 'LookupTable_accGradParameters' (a nil value)

I'm getting an error when running th LSTMN.lua -gpuid 0.

$ th LSTMN.lua -gpuid 0
using CUDA on GPU 0...
Processing text into tensors...
Token count: train 550152, val 10000, test 10000
Word vocab size: 61590
data load done. Number of batches in train: 34384, val: 625, test: 625
number of parameters in the model: 39483906
cloning dec
cloning enc
install/bin/luajit: ./util/LookupTableEmbedding_train.lua:55: attempt to call field 'LookupTable_accGradParameters' (a nil value)
stack traceback:
        ./util/LookupTableEmbedding_train.lua:55: in function 'accGradParameters'
        ...distro/install/share/lua/5.1/nngraph/gmodule.lua:409: in function 'neteval'
        ...distro/install/share/lua/5.1/nngraph/gmodule.lua:420: in function 'accGradParameters'
        distro/install/share/lua/5.1/nn/Module.lua:31: in function 'backward'
        LSTMN.lua:208: in function 'opfunc'
        ...distro/install/share/lua/5.1/optim/adam.lua:33: in function 'adam'
        LSTMN.lua:251: in main chunk
        [C]: in function 'dofile'
        ...distro/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
        [C]: at 0x00406670

To get data files, I followed these steps:

  1. downloaded SNLI from http://nlp.stanford.edu/projects/snli/. Generated train.txt, test.txt and dev.txt with format label(0, 1, 2) \t sentence1 \t sentence2 \n (where 0: entailment, 1: contradiction, 2: neutral)

  2. downloaded word2vec from https://code.google.com/archive/p/word2vec/ (Google News 300d pretrained). Tweaked this gist to produce a space-separated file of word d1 d2 d3...d300\n
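The line format described in step 1 can be sketched as follows; the label mapping is taken from the step itself, while the function and table names are hypothetical:

```lua
-- Sketch of emitting one training line in the format
-- label(0, 1, 2) \t sentence1 \t sentence2, with the label mapping
-- described above (0: entailment, 1: contradiction, 2: neutral).
local label_map = { entailment = 0, contradiction = 1, neutral = 2 }

local function format_example(gold_label, s1, s2)
  return string.format("%d\t%s\t%s", label_map[gold_label], s1, s2)
end

print(format_example("neutral", "A man eats.", "A man is hungry."))
```

Note that if the repository's criterion expects one-based targets, the loader must shift these zero-based labels when reading them back in.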

Out of memory error

Thank you very much for updating :D . But one last problem: can you tell me how much GPU memory this needs? I have a 4 GB GTX 960M, but it throws an out-of-memory error.
