Cleaned code for paper "Natural Language Inference over Interaction Space"

License: Apache License 2.0


densely-interactive-inference-network's Issues

Add a script to auto download datasets

I like how other similar projects allow you to auto download datasets simply by running a script, so here you go! GH-4

How to train on the Quora dataset?

How can I apply this approach to the Quora dataset? I saw the train_quora.py file, but which files does it need, and how do I generate them?

Does anyone have a pretrained model here?

Does anyone have a pretrained model here? If so, how do I use it?

filter_heights only 5?

pa("--filter_heights", type=str, default="5")

Why not 3,4,5?
Thank you!
@YichenGong
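
Because the flag is declared as a string, a comma-separated value such as "3,4,5" could in principle carry several heights. The following is a minimal sketch of that idea, not the repository's code: it assumes `pa` is an alias for `add_argument`, and the helper `parse_filter_heights` is hypothetical. Whether the model code actually splits the value on commas is not confirmed by this issue.

```python
import argparse


def parse_filter_heights(value):
    """Turn a comma-separated string such as "3,4,5" into [3, 4, 5]."""
    # Hypothetical helper: the repository's own parsing is not shown in the issue.
    return [int(part) for part in value.split(",") if part.strip()]


parser = argparse.ArgumentParser()
# Same flag name, type, and default as the line quoted in the issue.
parser.add_argument("--filter_heights", type=str, default="5")

default_args = parser.parse_args([])
multi_args = parser.parse_args(["--filter_heights", "3,4,5"])

print(parse_filter_heights(default_args.filter_heights))  # [5]
print(parse_filter_heights(multi_args.filter_heights))    # [3, 4, 5]
```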

AssertionError on save_submission

Densely-Interactive-Inference-Network/python/train_mnli.py

Line 478 in b36b922

save_submission(path, IDs, logits[1:])

save_submission(path, IDs, logits[1:]) should be just save_submission(path, IDs, logits)

Densely-Interactive-Inference-Network/python/train_mnli.py

Line 477 in b36b922

logits = np.argmax(logits[1:], axis=1)

Line 477 has already sliced logits, so slicing again on line 478 drops one more prediction.
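
The effect of the double slice can be reproduced on toy data. This is only a sketch: `argmax_rows` is a pure-Python stand-in for `np.argmax(..., axis=1)`, `save_submission_stub` is a hypothetical replacement for the real `save_submission`, and the length assertion inside it is an assumption based on the issue title.

```python
def argmax_rows(rows):
    """Index of the largest value in each row (stand-in for np.argmax(rows, axis=1))."""
    return [max(range(len(row)), key=row.__getitem__) for row in rows]


def save_submission_stub(ids, predictions):
    # Hypothetical stand-in: assumes the real function asserts matching lengths.
    assert len(ids) == len(predictions)
    return list(zip(ids, predictions))


# Four rows of raw logits; the first row is assumed to be a placeholder
# that the slice on line 477 is meant to drop.
raw_logits = [
    [0.0, 0.0, 0.0],
    [0.1, 0.7, 0.2],
    [0.9, 0.05, 0.05],
    [0.2, 0.3, 0.5],
]
ids = [101, 102, 103]

predictions = argmax_rows(raw_logits[1:])  # mirrors line 477: one prediction per ID
buggy = predictions[1:]                    # mirrors line 478: a second slice drops one more

print(len(ids), len(predictions), len(buggy))  # 3 3 2

rows = save_submission_stub(ids, predictions)  # lengths match, so this passes
```

Passing `buggy` instead of `predictions` to the stub would trip the length assertion, which matches the behaviour described in the issue.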

Error on PYTHONHASHSEED=0 python3 train_mnli.py DIIN demo_testing_SNLI --training_completely_on_snli

Everything works until shared.jsonl is loaded; when data_processing.py tries to load it, the following error is raised:

[1] Loading data SNLI
550152it [00:09, 55502.23it/s]
10000it [00:00, 58533.31it/s]
10000it [00:00, 64180.29it/s]
[1] Loading data MNLI
392702it [00:07, 50719.45it/s]
10000it [00:00, 61434.57it/s]
10000it [00:00, 60510.07it/s]
9796it [00:00, 57942.87it/s]
9847it [00:00, 56161.09it/s]
../data/shared.jsonl
Traceback (most recent call last):
  File "train_mnli.py", line 68, in <module>
    shared_content = load_mnli_shared_content()
  File "..../Densely-Interactive-Inference-Network/python/util/data_processing.py", line 173, in load_mnli_shared_content
    assert shared_file_exist
AssertionError

It seems the data downloader fetches shared.json rather than shared.jsonl, while the training script tries to load a .jsonl file.

Everything works fine when shared.jsonl is downloaded manually instead of through the script.
You could also rename shared.json (the link that redirects to the file download) to shared.jsonl in the README.
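
Until the downloader is fixed, a local workaround could look like the sketch below. It assumes the downloaded shared.json has the same content the training script expects in shared.jsonl, which the issue suggests but does not confirm; the function name `ensure_shared_jsonl` is hypothetical.

```python
import shutil
from pathlib import Path


def ensure_shared_jsonl(data_dir):
    """Copy shared.json to shared.jsonl if only the former exists.

    Returns True when shared.jsonl is present afterwards.
    """
    data_dir = Path(data_dir)
    target = data_dir / "shared.jsonl"
    source = data_dir / "shared.json"
    if not target.exists() and source.exists():
        # Copy rather than rename so the original download stays in place.
        shutil.copyfile(source, target)
    return target.exists()
```

Calling `ensure_shared_jsonl("../data")` before starting train_mnli.py would match the `../data/shared.jsonl` path printed in the log above.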

cannot find key 'sentence1_binary_parse_index_sequence'

Hello,
I am going through your code to understand the model.

In the get_minibatch function there is this line:

premise_vectors = fill_feature_vector_with_cropping_or_padding([dataset[i]['sentence1_binary_parse_index_sequence'][:] for i in indices], premise_pad_crop_pair, 1)

'sentence1_binary_parse_index_sequence' is supposed to be a key in each dictionary of the train_snli list.

However, I cannot find where this key is created. The original SNLI dataset does not have it.

Regards,
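
One way to narrow this down is to check which loaded records actually carry the key at the point where get_minibatch runs. The sketch below uses toy records and a hypothetical helper, `records_missing_key`; it does not show where the repository adds the key, only how to test for its presence.

```python
def records_missing_key(dataset, key):
    """Return the indices of records (dicts) that do not contain `key`."""
    return [i for i, record in enumerate(dataset) if key not in record]


KEY = "sentence1_binary_parse_index_sequence"

# Toy stand-in for train_snli: a list of dictionaries, one per example.
toy_dataset = [
    {"pairID": "a", KEY: [4, 8, 15]},
    {"pairID": "b"},                  # resembles a raw SNLI record without the key
    {"pairID": "c", KEY: [16, 23]},
]

print(records_missing_key(toy_dataset, KEY))  # [1]
```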

No multinli_0.9_test_matched_unlabeled.jsonl

Regarding the README section: the resulting data tree should be

data
├── download.py
├── embeddings
│   └── mnli_emb_snli_embedding.pkl.gz
├── glove.840B.300d.txt
├── glove.840B.300d.zip
├── __MACOSX
│   ├── multinli_0.9
│   └── snli_1.0
├── multinli_0.9
│   ├── Icon\015
│   ├── multinli_0.9_dev_matched.jsonl
│   ├── multinli_0.9_dev_matched.txt
│   ├── multinli_0.9_dev_mismatched.jsonl
│   ├── multinli_0.9_dev_mismatched.txt
│   ├── multinli_0.9_test_matched_unlabeled.jsonl
│   ├── multinli_0.9_test_mismatched_unlabeled.jsonl
│   ├── multinli_0.9_train.jsonl
│   ├── multinli_0.9_train.txt
│   └── paper.pdf
├── multinli_0.9.zip
├── shared.json
├── shared.jsonl
├── snli_1.0
│   ├── Icon\015
│   ├── README.txt
│   ├── snli_1.0_dev.jsonl
│   ├── snli_1.0_dev.txt
│   ├── snli_1.0_test.jsonl
│   ├── snli_1.0_test.txt
│   ├── snli_1.0_train.jsonl
│   └── snli_1.0_train.txt
└── snli_1.0.zip

6 directories, 26 files

The multinli_0.9_test_matched_unlabeled.jsonl and multinli_0.9_test_mismatched_unlabeled.jsonl files must be present to run the entire code.
Also, as noted in #5, there should be a shared.jsonl file.
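
A quick pre-flight check for the layout above could be sketched as follows. The `EXPECTED_FILES` list is an assumption taken from the tree in this issue (the .jsonl files plus shared.jsonl); it is not read from the training script, and `missing_files` is a hypothetical helper.

```python
from pathlib import Path

# Assumed from the directory tree above, not from the training script itself.
EXPECTED_FILES = [
    "shared.jsonl",
    "multinli_0.9/multinli_0.9_train.jsonl",
    "multinli_0.9/multinli_0.9_dev_matched.jsonl",
    "multinli_0.9/multinli_0.9_dev_mismatched.jsonl",
    "multinli_0.9/multinli_0.9_test_matched_unlabeled.jsonl",
    "multinli_0.9/multinli_0.9_test_mismatched_unlabeled.jsonl",
    "snli_1.0/snli_1.0_train.jsonl",
    "snli_1.0/snli_1.0_dev.jsonl",
    "snli_1.0/snli_1.0_test.jsonl",
]


def missing_files(data_dir):
    """Return the expected relative paths that are absent under data_dir."""
    root = Path(data_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    for rel in missing_files("../data"):
        print("missing:", rel)
```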

yichengong / densely-interactive-inference-network

