esim's People

Contributors

coetaur0


esim's Issues

ImportError in train_snli.py

(Edit: sorry, stupid question, found it myself.)

If I try to run train_snli.py, an error occurs:

Traceback (most recent call last):
  File "train_snli.py", line 19, in <module>
    from utils import train, validate
ImportError: cannot import name 'train'

This seems expected, because esim/esim/utils.py does not include train or validate.
Is this import referring to another script, or what should be done?
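
Update, in case it helps others: train and validate appear to be defined in the utils.py that sits next to the training scripts, not in esim/utils.py, so the import presumably only resolves when the script's own directory comes first on the module search path. A minimal sketch of a workaround, assuming that layout:

# Workaround sketch, assuming train/validate live in the utils.py next to
# train_snli.py rather than in the esim package: make sure the script's own
# directory is searched before any other module named utils.
import os
import sys

sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))

from utils import train, validate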

Getting Segmentation fault while training on MNLI

==================== Training ESIM model on device: cuda:0 ====================

  • Training epoch 1:
    Avg. batch proc. time: 0.0299s, loss: 0.8712: 100%|█████████████████████████████████████████████████████████████| 49088/49088 [26:49<00:00, 30.49it/s]
    Segmentation fault (core dumped)
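
A generic way to localize such a crash (not specific to this repo) is Python's built-in faulthandler module, which prints a Python-level traceback when the process receives a fatal signal:

# Generic diagnostic sketch: enabling faulthandler at the top of train_mnli.py makes
# Python print a traceback for the code that was running when the segfault hit.
import faulthandler

faulthandler.enable()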

training loss is not reduced and accuracy is not improved during training

Dear author,

When I was running train_mnli.py and train_snli.py, I met the same device problem as in #15. I then set the device of idx_range (lines 40-41 in esim/utils.py) to the correct device, which solved it.
[screenshot of the change]

(I don't know whether this change causes the following problem, so I list it here.)
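
For reference, the change amounted to something like this (a sketch with hypothetical tensor names, based on the description above):

# Sketch of the fix described above (tensor names are hypothetical): create
# idx_range on the same device as the batch instead of defaulting to the CPU.
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
sequences_batch = torch.zeros(8, 20, device=device)  # stand-in for a real batch

idx_range = torch.arange(0, sequences_batch.size(0),
                         device=sequences_batch.device)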

But then I hit a new problem: the training loss does not decrease and the accuracy does not improve during training.
[screenshot: training log with flat loss and accuracy]

Could you please help me with this? Thanks a lot!

What is the BNLI dataset?

What is the Breaking NLI (BNLI) dataset, and where can I download it? I cannot find it by searching for "Breaking NLI (BNLI)" or "Breaking NLI (BNLI) dataset" on Google.
Thank you very much!

Testing on MNLI a model trained on SNLI

Hi,
For research purposes, I am trying to train the ESIM model on the SNLI dataset and then evaluate the classifier on the MNLI dataset (e.g. on the dev set). I tried to do this by calling the test_snli.py script and giving it the (matched) MNLI dev set (after preprocessing) as the test set. I get the following error:

Traceback (most recent call last):
  File "test_snli.py", line 132, in <module>
    args.batch_size)
  File "test_snli.py", line 113, in main
    batch_time, total_time, accuracy = test(model, test_loader)
  File "test_snli.py", line 54, in test
    hypotheses_lengths)
  File "/specific/netapp5/joberant/nlp_fall_2020/liaderez/nlp_project/ESIM/esim_env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/lib/python3.7/site-packages/esim/model.py", line 128, in forward
    premises_lengths)
  File "/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/lib/python3.7/site-packages/esim/layers.py", line 117, in forward
    batch_first=True)
  File "/lib/python3.7/site-packages/torch/nn/utils/rnn.py", line 223, in pack_padded_sequence
    lengths = torch.as_tensor(lengths, dtype=torch.int64)
RuntimeError: CUDA error: device-side assert triggered

I looked at both datasets and the data seems to be formatted in the same way, so I do not know why there is an issue.
Does the code support training a model on one dataset and testing it on another? If not, any suggestions on how I should do it?
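
One check I could run (a sketch; the attribute and batch-key names are assumptions): since a device-side assert is often an out-of-range index, verify that no token index in the preprocessed MNLI data exceeds the vocabulary size the SNLI-trained embedding expects. Running with CUDA_LAUNCH_BLOCKING=1 should also pin down the failing op.

# Sanity-check sketch (attribute and key names are assumptions): make sure no token
# index in the MNLI dev set exceeds the vocabulary the SNLI-trained model expects,
# since a CUDA device-side assert frequently means an out-of-range index.
vocab_size = model._word_embedding.num_embeddings  # assumed attribute name

for batch in test_loader:
    premises, hypotheses = batch["premise"], batch["hypothesis"]  # assumed keys
    assert int(premises.max()) < vocab_size, "premise token index out of range"
    assert int(hypotheses.max()) < vocab_size, "hypothesis token index out of range"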

Thanks!

No such file or directory: worddict.pkl

Hello! I just want to know where worddict.pkl is supposed to come from when I run 'preprocess_bnli.py --config ...bnli_preprocessing.json':
File "preprocess_bnli.py", line 73, in preprocess_BNLI_data
with open(worddict, 'rb') as pkl:
FileNotFoundError: [Errno 2] No such file or directory: '../data/preprocessed/SNLI/worddict.pkl'
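
The path in the traceback suggests worddict.pkl is an artifact of the SNLI preprocessing step, so a quick check before running the BNLI script might be (a sketch; the path is taken from the traceback):

# Sketch: worddict.pkl appears to be produced by SNLI preprocessing, so check for
# it (path from the traceback) before running preprocess_bnli.py.
import os

worddict = "../data/preprocessed/SNLI/worddict.pkl"
if not os.path.isfile(worddict):
    print("Not found; presumably preprocess_snli.py must be run first.")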

Validation loss lower than training loss?

Throughout my first 7 epochs, the loss is always lower on the validation set than on the training set. Did anything go wrong, or is the validation set just too easy?

  • Training epoch 2:
    Avg. batch proc. time: 0.0477s, loss: 0.4858: 100%|█████████████████████████████████████████████████████████| 17168/17168 [14:15<00:00, 20.07it/s]
    -> Training time: 855.3887s, loss = 0.4858, accuracy: 81.1468%

  • Validation for epoch 2:
    -> Valid. time: 3.0217s, loss: 0.3845, accuracy: 85.4095%

  • Training epoch 3:
    Avg. batch proc. time: 0.0476s, loss: 0.4385: 100%|█████████████████████████████████████████████████████████| 17168/17168 [14:13<00:00, 20.12it/s]
    -> Training time: 853.2559s, loss = 0.4385, accuracy: 83.2263%

  • Validation for epoch 3:
    -> Valid. time: 2.9044s, loss: 0.3668, accuracy: 86.1613%

  • Training epoch 4:
    Avg. batch proc. time: 0.0477s, loss: 0.4120: 100%|█████████████████████████████████████████████████████████| 17168/17168 [14:14<00:00, 20.08it/s]
    -> Training time: 854.8605s, loss = 0.4120, accuracy: 84.4212%

  • Validation for epoch 4:
    -> Valid. time: 2.9331s, loss: 0.3626, accuracy: 86.4966%

  • Training epoch 5:
    Avg. batch proc. time: 0.0477s, loss: 0.3917: 100%|█████████████████████████████████████████████████████████| 17168/17168 [14:14<00:00, 20.08it/s]
    -> Training time: 854.8143s, loss = 0.3917, accuracy: 85.3156%

  • Validation for epoch 5:
    -> Valid. time: 2.9344s, loss: 0.3559, accuracy: 86.7608%

  • Training epoch 6:
    Avg. batch proc. time: 0.0476s, loss: 0.3766: 100%|█████████████████████████████████████████████████████████| 17168/17168 [14:14<00:00, 20.10it/s]
    -> Training time: 854.1151s, loss = 0.3766, accuracy: 85.9788%

  • Validation for epoch 6:
    -> Valid. time: 2.9510s, loss: 0.3426, accuracy: 87.2892%

  • Training epoch 7:
    Avg. batch proc. time: 0.0477s, loss: 0.3639: 100%|█████████████████████████████████████████████████████████| 17168/17168 [14:15<00:00, 20.08it/s]
    -> Training time: 855.0051s, loss = 0.3639, accuracy: 86.5372%

  • Validation for epoch 7:
    -> Valid. time: 2.9162s, loss: 0.3464, accuracy: 87.7058%
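
One plausible explanation (not from the repo): dropout is active while the training loss is computed but disabled during validation, and the training loss is averaged over the whole epoch while validation runs on the final weights. A minimal sketch with a hypothetical toy model showing the dropout effect alone:

# Sketch (hypothetical toy model, unrelated to the repo): after a little fitting,
# the loss measured with dropout active (train mode) typically exceeds the loss
# measured with dropout disabled (eval mode) on the very same data.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Dropout(0.5), nn.Linear(64, 2))
x, y = torch.randn(256, 10), torch.randint(0, 2, (256,))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

model.train()
for _ in range(200):  # brief fitting so the model is no longer random
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()

with torch.no_grad():
    train_mode_loss = torch.stack([criterion(model(x), y) for _ in range(100)]).mean()
    model.eval()  # dropout disabled, as during validation
    eval_mode_loss = criterion(model(x), y)
print(f"train mode: {train_mode_loss.item():.4f}  eval mode: {eval_mode_loss.item():.4f}")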

ModuleNotFoundError: No module named 'esim'

No matter which way I run it, I get the error "ModuleNotFoundError: No module named 'esim'". Thank you very much!

➜  preprocessing git:(master) pwd
~/tmp/ESIM/scripts/preprocessing
➜  preprocessing git:(master) python preprocess_snli.py
Traceback (most recent call last):
  File "preprocess_snli.py", line 12, in <module>
    from esim.data import Preprocessor
ModuleNotFoundError: No module named 'esim'
➜  scripts git:(master) pwd
~/tmp/ESIM/scripts
➜  scripts git:(master) python preprocessing/preprocess_snli.py
Traceback (most recent call last):
  File "preprocessing/preprocess_snli.py", line 12, in <module>
    from esim.data import Preprocessor
ModuleNotFoundError: No module named 'esim'
➜  ESIM git:(master) pwd
~/tmp/ESIM
➜  ESIM git:(master) python scripts/preprocessing/preprocess_snli.py
Traceback (most recent call last):
  File "scripts/preprocessing/preprocess_snli.py", line 12, in <module>
    from esim.data import Preprocessor
ModuleNotFoundError: No module named 'esim'
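
Since the esim package lives at the repository root, the usual fixes are to install it (e.g. pip install . from the root, if the repo provides a setup.py) or to put the root on sys.path. A sketch of the latter, assuming the directory layout shown above:

# Workaround sketch: add the repository root (the directory containing the esim/
# package) to sys.path at the top of preprocess_snli.py. Paths assume the layout
# shown in the terminal session above.
import os
import sys

repo_root = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", ".."))
sys.path.insert(0, repo_root)

from esim.data import Preprocessor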

SciTail dataset

Hi,
Is there no preprocess_scitail.py script?
Did you not test on the SciTail dataset?

ESIM using Keras

Hi,
Since I don't have access to a GPU, I can't run your code, but there is another implementation on GitHub that builds your model with the Keras library. Can you confirm whether the following code is correct?

"""
Implementation of ESIM (Enhanced LSTM for Natural Language Inference)
https://arxiv.org/abs/1609.06038
"""
import numpy as np
from keras.layers import *
from keras.activations import softmax
from keras.models import Model

def StaticEmbedding(embedding_matrix):
    in_dim, out_dim = embedding_matrix.shape
    return Embedding(in_dim, out_dim, weights=[embedding_matrix], trainable=False)

def subtract(input_1, input_2):
    minus_input_2 = Lambda(lambda x: -x)(input_2)
    return add([input_1, minus_input_2])

def aggregate(input_1, input_2, num_dense=300, dropout_rate=0.5):
    feat1 = concatenate([GlobalAvgPool1D()(input_1), GlobalMaxPool1D()(input_1)])
    feat2 = concatenate([GlobalAvgPool1D()(input_2), GlobalMaxPool1D()(input_2)])
    x = concatenate([feat1, feat2])
    x = BatchNormalization()(x)
    x = Dense(num_dense, activation='relu')(x)
    x = BatchNormalization()(x)
    x = Dropout(dropout_rate)(x)
    x = Dense(num_dense, activation='relu')(x)
    x = BatchNormalization()(x)
    x = Dropout(dropout_rate)(x)
    return x

def align(input_1, input_2):
    attention = Dot(axes=-1)([input_1, input_2])
    w_att_1 = Lambda(lambda x: softmax(x, axis=1))(attention)
    w_att_2 = Permute((2, 1))(Lambda(lambda x: softmax(x, axis=2))(attention))
    in1_aligned = Dot(axes=1)([w_att_1, input_1])
    in2_aligned = Dot(axes=1)([w_att_2, input_2])
    return in1_aligned, in2_aligned

def build_model(embedding_matrix, num_class=1, max_length=30, lstm_dim=300):
    q1 = Input(shape=(max_length,))
    q2 = Input(shape=(max_length,))

    # Embedding
    embedding = StaticEmbedding(embedding_matrix)
    q1_embed = BatchNormalization(axis=2)(embedding(q1))
    q2_embed = BatchNormalization(axis=2)(embedding(q2))

    # Encoding
    encode = Bidirectional(LSTM(lstm_dim, return_sequences=True))
    q1_encoded = encode(q1_embed)
    q2_encoded = encode(q2_embed)

    # Alignment
    q1_aligned, q2_aligned = align(q1_encoded, q2_encoded)

    # Compare
    q1_combined = concatenate([q1_encoded, q2_aligned, subtract(q1_encoded, q2_aligned), multiply([q1_encoded, q2_aligned])])
    q2_combined = concatenate([q2_encoded, q1_aligned, subtract(q2_encoded, q1_aligned), multiply([q2_encoded, q1_aligned])])
    compare = Bidirectional(LSTM(lstm_dim, return_sequences=True))
    q1_compare = compare(q1_combined)
    q2_compare = compare(q2_combined)

    # Aggregate
    x = aggregate(q1_compare, q2_compare)
    x = Dense(num_class, activation='sigmoid')(x)

    return Model(inputs=[q1, q2], outputs=x)

GitHub gist: https://gist.github.com/namakemono/b74547e82ef9307da9c29057c650cdf1
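
For completeness, a hypothetical usage sketch (the embedding matrix below is a random stand-in for real pretrained vectors such as GloVe):

# Hypothetical usage sketch: embedding_matrix is a random placeholder; a real run
# would load pretrained word vectors instead.
import numpy as np

embedding_matrix = np.random.rand(20000, 300).astype("float32")
model = build_model(embedding_matrix, num_class=1, max_length=30, lstm_dim=300)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()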

Buffered data was truncated after reaching the output size limit.

Hi
Thanks for the implementation.
Has this code been run on Google Colab?

I ran it on Google Colab, and at epoch 15 I got:

Buffered data was truncated after reaching the output size limit.

Do I need to run until epoch 64?
Around which epoch does the model perform best?
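
If the message is only Colab truncating the cell output (rather than a real crash), one workaround (a sketch, not part of the repo) is to send the training output to a log file instead of the notebook cell:

# Sketch: write print() output to a file so Colab's output-size limit does not
# truncate the training log. (Progress bars written to stderr, e.g. by tqdm,
# would need their own redirection.)
import sys

sys.stdout = open("training.log", "w")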

50% train/dev accuracy for a binary classification task

Hi, here is my question: I tried to use ESIM for a binary classification task. I got more than 90% train/dev accuracy with other models, but only 50% accuracy with ESIM. Has anyone met a similar issue before? Any suggestions would be great! Thanks a lot.

Complete ESIM implementation

Can you please state the reason why HIM (Hybrid Inference Model) is not implemented in many ESIM implementations?
Is there no visible improvement when the Tree-LSTM is added?
Which is advisable, plain ESIM or HIM (considering the time needed for inference, too)?
Or is BiMPM the best of the three?
