cheng6076 / SNLI-attention
SNLI with word-word attention by LSTM encoder-decoder
I get an error when I try to run LSTMN.lua:
/root/torch/install/bin/luajit: /root/torch/install/share/lua/5.1/nn/ClassNLLCriterion.lua:44: Assertion `cur_target >= 0 && cur_target < n_classes' failed. at /root/nn/generic/ClassNLLCriterion.c:52
stack traceback:
[C]: in function 'ClassNLLCriterion_updateOutput'
/root/torch/install/share/lua/5.1/nn/ClassNLLCriterion.lua:44: in function 'forward'
LSTMN.lua:203: in function 'opfunc'
/root/torch/install/share/lua/5.1/optim/adam.lua:37: in function 'adam'
LSTMN.lua:267: in main chunk
[C]: in function 'dofile'
/root/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x004064f0
I followed the setup and usage instructions. When I first ran LSTMN.lua, I got an error about a missing rnn module, so I installed rnn (using luarocks) and ran LSTMN.lua again. It runs for a while and then fails with the error shown above.
Note: I use glove.840B.300d as word2vec.txt
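For what it's worth, this assertion usually fires when targets fall outside [1, n_classes]: Torch's ClassNLLCriterion expects 1-indexed class labels, so 0-indexed labels (0, 1, 2) in the data files would trigger it if the loader does not shift them. Below is a minimal sketch of one possible pre-processing fix, assuming the label \t sentence1 \t sentence2 format described in this repo; shift_labels is a hypothetical helper, not part of the codebase.

```python
def shift_labels(in_path, out_path):
    """Rewrite a tab-separated data file so labels become 1-indexed.

    Assumes each line looks like: label \t sentence1 \t sentence2.
    """
    with open(in_path) as fin, open(out_path, "w") as fout:
        for line in fin:
            label, s1, s2 = line.rstrip("\n").split("\t")
            # Shift 0-indexed labels (0, 1, 2) to the 1-indexed range
            # (1, 2, 3) that ClassNLLCriterion expects.
            fout.write("\t".join([str(int(label) + 1), s1, s2]) + "\n")
```

Whether the loader already performs this shift is worth checking before editing the data files.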
Hi, when I run it on macOS, it doesn't work:
Processing text into tensors...
/Users/xxxxx/torch/install/bin/luajit: ./util/BatchLoaderC.lua:81: attempt to index local 'f' (a nil value)
stack traceback:
./util/BatchLoaderC.lua:81: in function 'text_to_tensor'
./util/BatchLoaderC.lua:15: in function 'create'
LSTMN.lua:62: in main chunk
[C]: in function 'dofile'
...owei/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:131: in main chunk
[C]: at 0x010580e2d0
I ran this experiment several times following the instructions, but I only get accuracy around 0.800. Should I change anything in the code? It appears to run correctly.
Does the following output mean the accuracy?
evaluating loss over split index 3
test_loss = 0.7915
Hi, I found that you don't use the previous intra-alignment vector to compute the intra-attention score in the encoder, as you do in the decoder.
Does it matter?
andy1028@andy1028-Envy:/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn$ luarocks make rocks/*
Warning: unmatched variable LUALIB
cmake -E make_directory build && cd build && cmake .. -DLUALIB= -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH="/home/andy1028/torch/install/bin/.." -DCMAKE_INSTALL_PREFIX="/home/andy1028/torch/install/lib/luarocks/rocks/cunn/scm-1" && make -j$(getconf _NPROCESSORS_ONLN) install
-- Found Torch7 in /home/andy1028/torch/install
-- Compiling for CUDA architecture: 5.0
-- Configuring done
-- Generating done
-- Build files have been written to: /media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/build
[ 2%] [ 7%] [ 7%] [ 12%] [ 12%] [ 19%] [ 19%] [ 19%] Building NVCC (Device) object CMakeFiles/cunn.dir//./cunn_generated_VolumetricMaxPooling.cu.o
Building NVCC (Device) object CMakeFiles/cunn.dir//./cunn_generated_LogSoftMax.cu.o
Building NVCC (Device) object CMakeFiles/cunn.dir//./cunn_generated_SpatialFractionalMaxPooling.cu.o
Building NVCC (Device) object CMakeFiles/cunn.dir//./cunn_generated_Abs.cu.o
Building NVCC (Device) object CMakeFiles/cunn.dir//./cunn_generated_AbsCriterion.cu.o
Building NVCC (Device) object CMakeFiles/cunn.dir//./cunn_generated_ClassNLLCriterion.cu.o
Building NVCC (Device) object CMakeFiles/cunn.dir//./cunn_generated_ELU.cu.o
Building NVCC (Device) object CMakeFiles/cunn.dir//./cunn_generated_DistKLDivCriterion.cu.o
/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/VolumetricMaxPooling.cu(26): error: identifier "THInf" is undefined
/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/VolumetricMaxPooling.cu(77): error: identifier "THInf" is undefined
2 errors detected in the compilation of "/tmp/tmpxft_00004c64_00000000-7_VolumetricMaxPooling.cpp1.ii".
CMake Error at cunn_generated_VolumetricMaxPooling.cu.o.cmake:267 (message):
Error generating file
/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/build/CMakeFiles/cunn.dir//./cunn_generated_VolumetricMaxPooling.cu.o
CMakeFiles/cunn.dir/build.make:5405: recipe for target 'CMakeFiles/cunn.dir/./cunn_generated_VolumetricMaxPooling.cu.o' failed
make[2]: *** [CMakeFiles/cunn.dir/./cunn_generated_VolumetricMaxPooling.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/ELU.cu(24): error: identifier "THCudaTensor_pointwiseApply2" is undefined
/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/ELU.cu(49): error: identifier "THCudaTensor_pointwiseApply3" is undefined
/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/Abs.cu(19): error: identifier "THCudaTensor_pointwiseApply2" is undefined
/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/Abs.cu(39): error: identifier "THCudaTensor_pointwiseApply3" is undefined
/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/SpatialFractionalMaxPooling.cu(44): error: identifier "THInf" is undefined
2 errors detected in the compilation of "/tmp/tmpxft_00004c75_00000000-7_ELU.cpp1.ii".
1 error detected in the compilation of "/tmp/tmpxft_00004c6a_00000000-7_SpatialFractionalMaxPooling.cpp1.ii".
2 errors detected in the compilation of "/tmp/tmpxft_00004c6d_00000000-7_Abs.cpp1.ii".
CMake Error at cunn_generated_ELU.cu.o.cmake:267 (message):
Error generating file
/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/build/CMakeFiles/cunn.dir//./cunn_generated_ELU.cu.o
CMakeFiles/cunn.dir/build.make:2299: recipe for target 'CMakeFiles/cunn.dir/./cunn_generated_ELU.cu.o' failed
make[2]: *** [CMakeFiles/cunn.dir/./cunn_generated_ELU.cu.o] Error 1
CMake Error at cunn_generated_SpatialFractionalMaxPooling.cu.o.cmake:267 (message):
Error generating file
/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/build/CMakeFiles/cunn.dir//./cunn_generated_SpatialFractionalMaxPooling.cu.o
CMakeFiles/cunn.dir/build.make:444: recipe for target 'CMakeFiles/cunn.dir/./cunn_generated_SpatialFractionalMaxPooling.cu.o' failed
make[2]: *** [CMakeFiles/cunn.dir/./cunn_generated_SpatialFractionalMaxPooling.cu.o] Error 1
CMake Error at cunn_generated_Abs.cu.o.cmake:267 (message):
Error generating file
/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn/build/CMakeFiles/cunn.dir//./cunn_generated_Abs.cu.o
CMakeFiles/cunn.dir/build.make:644: recipe for target 'CMakeFiles/cunn.dir/./cunn_generated_Abs.cu.o' failed
make[2]: *** [CMakeFiles/cunn.dir/./cunn_generated_Abs.cu.o] Error 1
CMakeFiles/Makefile2:60: recipe for target 'CMakeFiles/cunn.dir/all' failed
make[1]: *** [CMakeFiles/cunn.dir/all] Error 2
Makefile:117: recipe for target 'all' failed
make: *** [all] Error 2
Error: Build error: Failed building.
andy1028@andy1028-Envy:/media/andy1028/data1t/os_prj/github/SNLI-attention/cunn$
/opt/torch/install/bin/luajit: ./model/encoder_lstmn_w2v.lua:38: attempt to call field 'AddScalar' (a nil value)
stack traceback:
./model/encoder_lstmn_w2v.lua:38: in function 'lstmn'
LSTMN.lua:70: in main chunk
[C]: in function 'dofile'
/opt/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x004064d0
Does 'AddScalar' do the same as 'ReplicateAdd' in the previous version?
Just curious if you happen to know any TensorFlow LSTMN implementations.
Hello, I am trying to run this code. I cannot downgrade cunn because I have the latest cutorch, and I would need to downgrade cutorch as well, as said here. Rather than binary-searching over commits to find one that builds, could you tell me the cutorch version (commit id) you ran the code with?
Please choose a license, add a LICENSE.md file (I also suggest adding a COPYRIGHT.md file), and link to it from README.md.
I'm getting an error when running th LSTMN.lua -gpuid 0:
$ th LSTMN.lua -gpuid 0
using CUDA on GPU 0...
Processing text into tensors...
Token count: train 550152, val 10000, test 10000
Word vocab size: 61590
data load done. Number of batches in train: 34384, val: 625, test: 625
number of parameters in the model: 39483906
cloning dec
cloning enc
install/bin/luajit: ./util/LookupTableEmbedding_train.lua:55: attempt to call field 'LookupTable_accGradParameters' (a nil value)
stack traceback:
./util/LookupTableEmbedding_train.lua:55: in function 'accGradParameters'
...distro/install/share/lua/5.1/nngraph/gmodule.lua:409: in function 'neteval'
...distro/install/share/lua/5.1/nngraph/gmodule.lua:420: in function 'accGradParameters'
distro/install/share/lua/5.1/nn/Module.lua:31: in function 'backward'
LSTMN.lua:208: in function 'opfunc'
...distro/install/share/lua/5.1/optim/adam.lua:33: in function 'adam'
LSTMN.lua:251: in main chunk
[C]: in function 'dofile'
...distro/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00406670
To get the data files, I followed these steps:
Downloaded SNLI from http://nlp.stanford.edu/projects/snli/ and generated train.txt, test.txt and dev.txt in the format label(0, 1, 2) \t sentence1 \t sentence2 \n (where 0: entailment, 1: contradiction, 2: neutral).
Downloaded word2vec from https://code.google.com/archive/p/word2vec/ (Google News 300d pretrained) and tweaked this gist to produce a space-separated file of word d1 d2 d3 ... d300\n.
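The first conversion step above can be sketched as follows. This is only an illustration, assuming the snli_1.0_*.txt TSV layout with gold_label, sentence1 and sentence2 columns; the convert function and file names are hypothetical, not part of the repo.

```python
import csv

# Label mapping described in the issue: 0 entailment, 1 contradiction, 2 neutral.
LABELS = {"entailment": "0", "contradiction": "1", "neutral": "2"}

def convert(snli_tsv, out_txt):
    """Turn an SNLI TSV file into label \t sentence1 \t sentence2 lines."""
    with open(snli_tsv) as fin, open(out_txt, "w") as fout:
        for row in csv.DictReader(fin, delimiter="\t"):
            label = LABELS.get(row["gold_label"])
            if label is None:  # skip pairs with no gold label ("-")
                continue
            fout.write("\t".join([label, row["sentence1"], row["sentence2"]]) + "\n")
```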
Thank you very much for updating :D. One last problem: can you tell me how much GPU memory this needs? I have a 4GB GTX 960M, but it throws an out-of-memory error.
Thanks for sharing the research code along with the paper. I am reading the code in parallel with the paper and I think it is very useful for understanding details that are not explicitly mentioned in the paper. I am not sure if this is the right place to ask but I have two questions regarding the implementation.
1. The decoder takes {rnn_alpha, rnn_h_dec} as input. I don't quite understand why rnn_alpha is part of the input. Shouldn't the input be the hidden state vectors for the source and target sequences, i.e. {rnn_h_enc, rnn_h_dec}?
2. Why do you copy drnn_c_dec[1] and drnn_h_dec[1] to the gradients drnn_c_enc[max_length+1] and drnn_h_enc[max_length+1] on lines 222-223 of SNLI-attention/LSTMN.lua? After reading the paper and the rest of the implementation, I have the impression that the initial hidden state and memory vectors of the decoder are random vectors and do not depend on the final hidden state and memory vectors of the encoder.
In line 153:
local max, indice = prediction:max(2) -- indice is a 2d tensor here, we need to flatten it
if indice[1][1] == label[1] then correct_count = correct_count + 1 end
It seems to check only the first result of a batch. Why not check all of them?
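The suggested fix is to count matches over the whole batch rather than only its first element. A language-agnostic sketch in Python (batch_correct is a hypothetical helper, not part of the repo):

```python
def batch_correct(pred_indices, labels):
    """Count correct predictions across a whole batch.

    pred_indices and labels are equal-length sequences of class ids,
    e.g. the flattened argmax indices and the gold labels for one batch.
    """
    return sum(1 for p, y in zip(pred_indices, labels) if p == y)
```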
/opt/torch/install/bin/luajit: LSTMN.lua:301: attempt to index global 'enc_lookup' (a nil value)
stack traceback:
LSTMN.lua:301: in main chunk
[C]: in function 'dofile'
/opt/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x004064d0
Hey there,
I am getting an error when running main.lua. My dev/test/train and word2vec txt files seem fine, but I can't figure out how to fix the following error:
tput: No value for $TERM and no -T specified
/idiap/user/lmiculicich/Installations/torch/install/bin/luajit: ./util/BatchLoaderB.lua:34: bad argument #2 to 'sub' (out of range at /idiap/user/lmiculicich/Installations/torch/pkg/torch/generic/Tensor.c:304)
stack traceback:
[C]: in function 'sub'
./util/BatchLoaderB.lua:34: in function 'create'
/idiap/temp/jpilault/Idiap_project/SNLI-attention/main.lua:63: in main chunk
[C]: in function 'dofile'
...ions/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00405d50
Trying to load a model.t7 with
model = torch.load('model.t7')
but I am getting the following error:
...Installations/torch/install/share/lua/5.1/torch/File.lua:343: unknown Torch class <nn.AddScalar>
Declaring the class first with
local AddScalar = torch.class('nn.AddScalar')
works fine.