coetaur0 / esim
Implementation of the ESIM model for natural language inference with PyTorch
License: Apache License 2.0
-sorry, stupid question, found it myself.-
If I try to run train_snli.py, an error occurs:
Traceback (most recent call last):
  File "train_snli.py", line 19, in <module>
    from utils import train, validate
ImportError: cannot import name 'train'
The error seems justified, because esim/esim/utils.py does not define train or validate.
Does this import refer to another script? If not, what should be done?
==================== Training ESIM model on device: cuda:0 ====================
Dear author,
When running train_mnli.py and train_snli.py, I hit the same device problem as in #15. Setting the device of idx_range (lines 40-41 in esim/utils.py) to the correct device solved it.
(I don't know whether this change causes the problem described below, so I mention it here.)
I then ran into a new problem: the training loss does not decrease and the accuracy does not improve during training.
Could you please help me with this? Thanks a lot!
What is the Breaking NLI (BNLI) dataset? And where can I download the BNLI dataset? I cannot find the BNLI dataset by searching "Breaking NLI (BNLI)" or "Breaking NLI (BNLI) dataset" in Google.
Thank you very much!
Hi,
For research purposes, I am attempting to train the ESIM model on the SNLI dataset and then evaluate the classifier on the MNLI dataset (e.g. on the dev set). I tried to do this by calling the test_snli.py script and passing it the preprocessed dev set of the (matched) MNLI dataset as the test set. I get the following error:
Traceback (most recent call last):
  File "test_snli.py", line 132, in <module>
    args.batch_size)
  File "test_snli.py", line 113, in main
    batch_time, total_time, accuracy = test(model, test_loader)
  File "test_snli.py", line 54, in test
    hypotheses_lengths)
  File "/specific/netapp5/joberant/nlp_fall_2020/liaderez/nlp_project/ESIM/esim_env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/lib/python3.7/site-packages/esim/model.py", line 128, in forward
    premises_lengths)
  File "/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/lib/python3.7/site-packages/esim/layers.py", line 117, in forward
    batch_first=True)
  File "/lib/python3.7/site-packages/torch/nn/utils/rnn.py", line 223, in pack_padded_sequence
    lengths = torch.as_tensor(lengths, dtype=torch.int64)
RuntimeError: CUDA error: device-side assert triggered
I looked at both datasets and the data seems to be formatted in the same way, so I do not know why there is an issue.
Does the code support training a model on one dataset and testing it on another? If not, do you have any suggestions on how I should do it?
Thanks!
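A device-side assert like the one above is often triggered by an index that is out of range, for example a word index that the embedding matrix does not cover, a label id outside the number of classes, or a sequence of length zero. Because CUDA reports errors asynchronously, the line shown in the traceback is not necessarily where the problem started. The following is only a sketch of a sanity check in plain Python. It assumes the preprocessed data can be viewed as lists of word-index lists plus a list of label ids; the names `check_batch_data`, `vocab_size`, and `num_classes` are hypothetical and not part of the repository.

```python
def check_batch_data(premises, hypotheses, labels, vocab_size, num_classes):
    """Return a list of human-readable problems found in the data.

    All arguments are assumptions for illustration: `premises` and
    `hypotheses` are lists of lists of word indices, `labels` is a list
    of integer class ids.
    """
    problems = []
    for name, sentences in (("premise", premises), ("hypothesis", hypotheses)):
        for i, sentence in enumerate(sentences):
            # Zero-length sequences cannot be packed for the LSTM.
            if len(sentence) == 0:
                problems.append(f"{name} {i}: empty sequence")
            for index in sentence:
                # Every word index must be a valid row of the embedding matrix.
                if not 0 <= index < vocab_size:
                    problems.append(f"{name} {i}: word index {index} out of range")
    for i, label in enumerate(labels):
        # Every label must be a valid class id for the classifier.
        if not 0 <= label < num_classes:
            problems.append(f"label {i}: class id {label} out of range")
    return problems


if __name__ == "__main__":
    issues = check_batch_data(
        premises=[[2, 5, 7], [2, 9]],
        hypotheses=[[2, 4], []],
        labels=[0, 3],
        vocab_size=8,
        num_classes=3,
    )
    for issue in issues:
        print(issue)
```

If a check like this reports nothing, running the same test on the CPU, or with the environment variable CUDA_LAUNCH_BLOCKING=1 set, usually produces a more precise error message.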
Hello, I would like to know where worddict.pkl is supposed to come from when I run 'preprocess_bnli.py --config ...bnli_preprocessing.json'.
File "preprocess_bnli.py", line 73, in preprocess_BNLI_data
with open(worddict, 'rb') as pkl:
FileNotFoundError: [Errno 2] No such file or directory: '../data/preprocessed/SNLI/worddict.pkl'
Throughout my first 7 epochs, the loss is always lower on the validation set than on the training set. Did something go wrong, or is the validation set just too easy?
Training epoch 2:
Avg. batch proc. time: 0.0477s, loss: 0.4858: 100%|█████████████████████████████████████████████████████████| 17168/17168 [14:15<00:00, 20.07it/s]
-> Training time: 855.3887s, loss = 0.4858, accuracy: 81.1468%
Validation for epoch 2:
-> Valid. time: 3.0217s, loss: 0.3845, accuracy: 85.4095%
Training epoch 3:
Avg. batch proc. time: 0.0476s, loss: 0.4385: 100%|█████████████████████████████████████████████████████████| 17168/17168 [14:13<00:00, 20.12it/s]
-> Training time: 853.2559s, loss = 0.4385, accuracy: 83.2263%
Validation for epoch 3:
-> Valid. time: 2.9044s, loss: 0.3668, accuracy: 86.1613%
Training epoch 4:
Avg. batch proc. time: 0.0477s, loss: 0.4120: 100%|█████████████████████████████████████████████████████████| 17168/17168 [14:14<00:00, 20.08it/s]
-> Training time: 854.8605s, loss = 0.4120, accuracy: 84.4212%
Validation for epoch 4:
-> Valid. time: 2.9331s, loss: 0.3626, accuracy: 86.4966%
Training epoch 5:
Avg. batch proc. time: 0.0477s, loss: 0.3917: 100%|█████████████████████████████████████████████████████████| 17168/17168 [14:14<00:00, 20.08it/s]
-> Training time: 854.8143s, loss = 0.3917, accuracy: 85.3156%
Validation for epoch 5:
-> Valid. time: 2.9344s, loss: 0.3559, accuracy: 86.7608%
Training epoch 6:
Avg. batch proc. time: 0.0476s, loss: 0.3766: 100%|█████████████████████████████████████████████████████████| 17168/17168 [14:14<00:00, 20.10it/s]
-> Training time: 854.1151s, loss = 0.3766, accuracy: 85.9788%
Validation for epoch 6:
-> Valid. time: 2.9510s, loss: 0.3426, accuracy: 87.2892%
Training epoch 7:
Avg. batch proc. time: 0.0477s, loss: 0.3639: 100%|█████████████████████████████████████████████████████████| 17168/17168 [14:15<00:00, 20.08it/s]
-> Training time: 855.0051s, loss = 0.3639, accuracy: 86.5372%
Validation for epoch 7:
-> Valid. time: 2.9162s, loss: 0.3464, accuracy: 87.7058%
No matter which directory I run the script from, I get the error "ModuleNotFoundError: No module named 'esim'". Thank you very much!
➜ preprocessing git:(master) pwd
~/tmp/ESIM/scripts/preprocessing
➜ preprocessing git:(master) python preprocess_snli.py
Traceback (most recent call last):
File "preprocess_snli.py", line 12, in <module>
from esim.data import Preprocessor
ModuleNotFoundError: No module named 'esim'
➜ scripts git:(master) pwd
~/tmp/ESIM/scripts
➜ scripts git:(master) python preprocessing/preprocess_snli.py
Traceback (most recent call last):
File "preprocessing/preprocess_snli.py", line 12, in <module>
from esim.data import Preprocessor
ModuleNotFoundError: No module named 'esim'
➜ ESIM git:(master) pwd
~/tmp/ESIM
➜ ESIM git:(master) python scripts/preprocessing/preprocess_snli.py
Traceback (most recent call last):
File "scripts/preprocessing/preprocess_snli.py", line 12, in <module>
from esim.data import Preprocessor
ModuleNotFoundError: No module named 'esim'
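Changing the working directory does not help here because Python puts the directory of the script itself on the module search path, not the repository root where the `esim` package lives. The traceback in the earlier issue shows `esim` under site-packages, so installing the package into the active environment (for example with pip from the repository root) is presumably the intended route. As an alternative, the sketch below prepends the repository root to `sys.path` before the import; the helper name `add_repo_root_to_path` and the `levels_up` parameter are hypothetical, and the layout scripts/preprocessing/ is taken from the session above.

```python
import sys
from pathlib import Path


def add_repo_root_to_path(script_path, levels_up=2):
    """Prepend the directory `levels_up` levels above the script's folder to sys.path.

    For a script in <root>/scripts/preprocessing/, levels_up=2 yields <root>.
    """
    root = Path(script_path).resolve().parent
    for _ in range(levels_up):
        root = root.parent
    root_str = str(root)
    if root_str not in sys.path:
        sys.path.insert(0, root_str)
    return root_str


if __name__ == "__main__":
    # Inside preprocess_snli.py one would pass __file__ and then import esim.
    print(add_repo_root_to_path(__file__))
```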
I am having a very hard time understanding how to run a prediction for a new premise/hypothesis pair. #help
Line 190 in fd335c3
In the code referenced above, why is the parameter bias_hh_l0[hidden_size:(2*hidden_size)] initialized to the constant value 1.0 instead of setting all values to zero?
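For context: in PyTorch's LSTM each bias vector has length 4 * hidden_size and is laid out gate by gate in the order input, forget, cell, output, so the slice [hidden_size:(2*hidden_size)] is the forget gate. Starting the forget-gate bias at 1 is a widely used heuristic that makes the cell keep its memory early in training. Whether that is the author's motivation is an assumption. The sketch below only illustrates the layout, using a plain Python list in place of a tensor; the function names are hypothetical.

```python
def lstm_gate_slices(hidden_size):
    """Map each gate name to its slice in a bias vector of length 4 * hidden_size."""
    names = ("input", "forget", "cell", "output")
    return {
        name: slice(k * hidden_size, (k + 1) * hidden_size)
        for k, name in enumerate(names)
    }


def init_bias(hidden_size, forget_bias=1.0):
    """Build a zero bias vector whose forget-gate entries are set to forget_bias."""
    bias = [0.0] * (4 * hidden_size)
    forget = lstm_gate_slices(hidden_size)["forget"]
    bias[forget] = [forget_bias] * hidden_size
    return bias


if __name__ == "__main__":
    print(init_bias(2))  # [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
```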
Hi
Is there no preprocess_scitail.py script?
Has the model not been tested on the SciTail dataset?
Hi
Since I don't have access to a GPU, I can't run your code, but there is another implementation of your model on GitHub that uses the Keras library. Can you confirm that the following code is correct?
"""
Implementation of ESIM(Enhanced LSTM for Natural Language Inference)
https://arxiv.org/abs/1609.06038
"""
import numpy as np
from keras.layers import *
from keras.activations import softmax
from keras.models import Model


def StaticEmbedding(embedding_matrix):
    in_dim, out_dim = embedding_matrix.shape
    return Embedding(in_dim, out_dim, weights=[embedding_matrix], trainable=False)


def subtract(input_1, input_2):
    minus_input_2 = Lambda(lambda x: -x)(input_2)
    return add([input_1, minus_input_2])


def aggregate(input_1, input_2, num_dense=300, dropout_rate=0.5):
    feat1 = concatenate([GlobalAvgPool1D()(input_1), GlobalMaxPool1D()(input_1)])
    feat2 = concatenate([GlobalAvgPool1D()(input_2), GlobalMaxPool1D()(input_2)])
    x = concatenate([feat1, feat2])
    x = BatchNormalization()(x)
    x = Dense(num_dense, activation='relu')(x)
    x = BatchNormalization()(x)
    x = Dropout(dropout_rate)(x)
    x = Dense(num_dense, activation='relu')(x)
    x = BatchNormalization()(x)
    x = Dropout(dropout_rate)(x)
    return x


def align(input_1, input_2):
    attention = Dot(axes=-1)([input_1, input_2])
    w_att_1 = Lambda(lambda x: softmax(x, axis=1))(attention)
    w_att_2 = Permute((2,1))(Lambda(lambda x: softmax(x, axis=2))(attention))
    in1_aligned = Dot(axes=1)([w_att_1, input_1])
    in2_aligned = Dot(axes=1)([w_att_2, input_2])
    return in1_aligned, in2_aligned


def build_model(embedding_matrix, num_class=1, max_length=30, lstm_dim=300):
    q1 = Input(shape=(max_length,))
    q2 = Input(shape=(max_length,))
    embedding = StaticEmbedding(embedding_matrix)
    q1_embed = BatchNormalization(axis=2)(embedding(q1))
    q2_embed = BatchNormalization(axis=2)(embedding(q2))
    encode = Bidirectional(LSTM(lstm_dim, return_sequences=True))
    q1_encoded = encode(q1_embed)
    q2_encoded = encode(q2_embed)
    q1_aligned, q2_aligned = align(q1_encoded, q2_encoded)
    q1_combined = concatenate([q1_encoded, q2_aligned, subtract(q1_encoded, q2_aligned), multiply([q1_encoded, q2_aligned])])
    q2_combined = concatenate([q2_encoded, q1_aligned, subtract(q2_encoded, q1_aligned), multiply([q2_encoded, q1_aligned])])
    compare = Bidirectional(LSTM(lstm_dim, return_sequences=True))
    q1_compare = compare(q1_combined)
    q2_compare = compare(q2_combined)
    x = aggregate(q1_compare, q2_compare)
    x = Dense(num_class, activation='sigmoid')(x)
    return Model(inputs=[q1, q2], outputs=x)

GitHub link: https://gist.github.com/namakemono/b74547e82ef9307da9c29057c650cdf1
Hi
Thanks for the implementation.
Has this code been run on Google Colab?
I ran it on Google Colab and got the following message at epoch 15:
Buffered data was truncated after reaching the output size limit.
Do I need to run until epoch 64?
Which epoch gives the best performance?
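The message quoted above appears to come from the notebook's limit on cell output size, not from the training code itself, so training may well continue after it is shown. One way to avoid flooding the cell output is to write progress to a file instead. The sketch below uses only the standard `logging` module; the helper `make_file_logger`, the logger name, and the log file name are hypothetical and not part of the repository.

```python
import logging


def make_file_logger(path, name="esim_training"):
    """Return a logger that appends timestamped messages to `path` only."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Do not forward records to the root logger, which may print to the cell.
    logger.propagate = False
    if not logger.handlers:
        handler = logging.FileHandler(path)
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = make_file_logger("training.log")
    # In a training loop, one line per epoch keeps the file small.
    log.info("epoch %d: train loss %.4f, valid accuracy %.4f", 1, 0.4858, 0.8541)
```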
Hi, I am trying to use ESIM for a binary classification task. With other models I get more than 90% train/dev accuracy, but with ESIM I only get 50% accuracy. Has anyone met a similar issue before? Any suggestion would be great. Thanks a lot.
I have run the code many times, but the test accuracy never reaches 88%; it stays at about 87.6%. Is there any detail that needs attention?
Can you please explain why HIM (Hybrid Inference Model) is left out of many ESIM implementations?
Is there no visible improvement when the Tree-LSTM is added?
Which is advisable, plain ESIM or HIM (also considering inference time)?
Or is BiMPM the best of the three?