
bert-relation-classification's Introduction

A PyTorch Implementation of BERT-based Relation Classification

This is a stable PyTorch implementation of Enriching Pre-trained Language Model with Entity Information for Relation Classification (https://arxiv.org/abs/1905.08284).

Requirements:

Python version >= 3.6 (recommended)

PyTorch version >= 1.1 (recommended)

pytorch-transformers: https://github.com/huggingface/pytorch-transformers
Note: pytorch-transformers version 1.1 is required.

Tutorial of the code

  1. Download the project and prepare the data
> git clone https://github.com/wang-h/bert-relation-classification
> cd bert-relation-classification 
  2. Train the BERT-based classification model
> python bert.py --config config.ini
...
09/11/2019 16:36:31 - INFO - pytorch_transformers.modeling_utils -   loading weights file /tmp/semeval/pytorch_model.bin
09/11/2019 16:36:33 - INFO - __main__ -   Loading features from cached file ./dataset/cached_dev_bert-large-uncased_128_semeval
09/11/2019 16:36:33 - INFO - __main__ -   Saving features into cached file ./dataset/cached_dev_bert-large-uncased_128_semeval
09/11/2019 16:36:34 - INFO - __main__ -   ***** Running evaluation  *****
09/11/2019 16:36:34 - INFO - __main__ -     Num examples = 2717
09/11/2019 16:36:34 - INFO - __main__ -     Batch size = 8
Evaluating: 100%|████████████████████████████████████████████████████| 340/340 [00:46<00:00,  7.24it/s]
09/11/2019 16:37:21 - INFO - __main__ -   ***** Eval results  *****  
10/07/2019 10:02:23 - INFO - __main__ -     acc = 0.8579315421420685
10/07/2019 10:02:23 - INFO - __main__ -     acc_and_f1 = 0.8579315421420685
10/07/2019 10:02:23 - INFO - __main__ -     f1 = 0.8579315421420685
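The training run is driven entirely by config.ini. The repository's actual keys are not reproduced here; a purely hypothetical fragment, built only from option names that appear in the logs and error messages elsewhere on this page (pretrained_model_name, task_name, max_seq_len, use_entity_indicator, the L2 lambda), might look like:

```ini
; hypothetical sketch -- check the config.ini shipped with the repository
; for the real section names and keys
[model]
pretrained_model_name = bert-large-uncased
max_seq_len = 128
use_entity_indicator = True
l2_lambda = 0.005

[task]
task_name = semeval
```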
  3. Evaluate with the official scorer for SemEval-2010 Task 8
> cd eval
> bash test.sh
> cat res.txt
(result reported in the paper, TensorFlow implementation) MACRO-averaged result (excluding Other, uncased-large-model): 89.25
(this PyTorch implementation) MACRO-averaged result (excluding Other, uncased-large-model): 89.25 (same)

I also have the source code written in TensorFlow. Feel free to contact me if you need it.

We would also appreciate it if you could cite our recent paper with the best result (90.36):

Enhancing Relation Extraction Using Syntactic Indicators and Sentential Contexts

https://arxiv.org/abs/1912.01858

or

An Extensible Framework of Leveraging Syntactic Skeleton for Semantic Relation Classification, ACM TALLIP, September 2020

https://dl.acm.org/doi/10.1145/3402885

bert-relation-classification's People

Contributors

wang-h


bert-relation-classification's Issues

pytorch_transformers became transformers

Just wanted to let you know that pytorch_transformers has been renamed to transformers, so "from pytorch_transformers import ..." becomes "from transformers import ...".
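For code that still targets the old package name, a small compatibility shim (a sketch, assuming only the top-level package name changed for the symbols used here) keeps both versions working:

```python
# Compatibility shim: newer releases ship as `transformers`,
# older ones as `pytorch_transformers`; try the new name first.
try:
    from transformers import BertConfig, BertTokenizer
except ImportError:
    from pytorch_transformers import BertConfig, BertTokenizer
```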

I tried to run the PyTorch code but am getting some errors, even though I downloaded everything you mentioned:

10/30/2019 19:20:06 - INFO - __main__ - using L2 regularization with lambda 0.00500
10/30/2019 19:20:08 - INFO - pytorch_transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at C:\Users\EJO\.cache\torch\pytorch_transformers\26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
10/30/2019 19:20:11 - INFO - pytorch_transformers.modeling_utils - loading weights file https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-pytorch_model.bin from cache at C:\Users\EJO\.cache\torch\pytorch_transformers\54da47087cc86ce75324e4dc9bbb5f66c6e83a7c6bd23baea8b489acc8d09aa4.4d5343a4b979c4beeaadef17a0453d1bb183dd9b084f58b84c7cc781df343ae6
Traceback (most recent call last):
  File "bert.py", line 427, in <module>
    main()
  File "bert.py", line 357, in main
    config.pretrained_model_name, config=bertconfig)
  File "C:\Users\EJO\Anaconda3\envs\venv\lib\site-packages\pytorch_transformers\modeling_utils.py", line 536, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "C:\Users\EJO\bert-relation-classification\model.py", line 66, in __init__
    self.apply(self.init_weights)
  File "C:\Users\EJO\Anaconda3\envs\venv\lib\site-packages\torch\nn\modules\module.py", line 242, in apply
    module.apply(fn)
  File "C:\Users\EJO\Anaconda3\envs\venv\lib\site-packages\torch\nn\modules\module.py", line 242, in apply
    module.apply(fn)
  File "C:\Users\EJO\Anaconda3\envs\venv\lib\site-packages\torch\nn\modules\module.py", line 242, in apply
    module.apply(fn)
  File "C:\Users\EJO\Anaconda3\envs\venv\lib\site-packages\torch\nn\modules\module.py", line 243, in apply
    fn(self)
TypeError: init_weights() takes 1 positional argument but 2 were given

(venv) C:\Users\EJO\bert-relation-classification>

Error while trying to use the model

Traceback (most recent call last):
  File "bert.py", line 429, in <module>
    main()
  File "bert.py", line 373, in main
    config, config.task_name, tokenizer, evaluate=False)
  File "bert.py", line 268, in load_and_cache_examples
    examples, label_list, config.max_seq_len, tokenizer, "classification", use_entity_indicator=config.use_entity_indicator)
  File "C:\Users\pilanisp\Desktop\BERT FINAL\BERT IE\bert-relation-classification\utils.py", line 281, in convert_examples_to_features
    e11_p = tokens_a.index("#")+1  # the start position of entity1
ValueError: '#' is not in list
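The ValueError means the "#" entity marker never made it into tokens_a, typically because the markers were stripped during preprocessing or the sentence was truncated at max_seq_len before the marker. A defensive version of the lookup (a sketch only; entity_span is a hypothetical helper, not the repository's code) makes that failure mode explicit:

```python
def entity_span(tokens, marker):
    """Return (start, end) of the tokens between a pair of entity markers.

    Hypothetical helper: raises a descriptive error instead of a bare
    ValueError when the markers are missing (e.g. stripped by the
    tokenizer or cut off by max_seq_len truncation).
    """
    if tokens.count(marker) < 2:
        raise ValueError(
            f"expected two {marker!r} markers, found {tokens.count(marker)}; "
            "check preprocessing and max_seq_len truncation"
        )
    start = tokens.index(marker) + 1       # first token inside the span
    end = tokens.index(marker, start)      # position of the closing marker
    return start, end

tokens = ["$", "virtuoso", "$", "finds", "harmony", "in", "#", "instrument", "#"]
print(entity_span(tokens, "#"))  # (7, 8)
```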

Where can I find the code?

In the paper "Enhancing Relation Extraction Using Syntactic Indicators and Sentential Contexts" you reported F1 = 90.36. Where can I find the code?

Question about the dataset

I have a problem understanding the structure of the dataset, for instance this row from /data/train.tsv:
16 texas - born [E11] virtuoso [E12] finds harmony sophistication in appalachian [E21] instrument [E22] 6 agency instrument 2

  • 16 is the row number
  • [E11] virtuoso [E12] is the first entity
  • [E21] instrument [E22] is the second entity
  • but I have a problem with this part: 6 agency instrument 2

I believe that agency is the entity type for E1 and instrument is the entity type for E2, but what do the 6 and the 2 mean?

I am asking because I need to find relations between a person and a company, for instance owns, works for, buys, and other possible relations.
Thanks in advance.

error

When I run
python bert.py --config config.ini

Traceback (most recent call last):
  File "bert.py", line 427, in <module>
    main()
  File "bert.py", line 357, in main
    config.pretrained_model_name, config=bertconfig)
  File "/usr/local/lib/python3.6/dist-packages/pytorch_transformers/modeling_utils.py", line 536, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/content/drive/My Drive/bert-relation-classification-master/model.py", line 66, in __init__
    self.apply(self.init_weights)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 440, in apply
    module.apply(fn)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 440, in apply
    module.apply(fn)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 440, in apply
    module.apply(fn)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 441, in apply
    fn(self)
TypeError: init_weights() takes 1 positional argument but 2 were given

Meet an error in init_weights()

Hello! Thank you for your implementation. I got an error when I ran the code.

python = 3.7
pytorch = 1.2.0
pytorch-transformers = 1.2.0

Traceback (most recent call last):
  File "/home/lab/code/bert-relation-classification/bert.py", line 359, in main
    model = BertForSequenceClassification.from_pretrained(model_dir, config=bertconfig)
  File "/home/lab/anaconda3/lib/python3.7/site-packages/transformers/modeling_utils.py", line 342, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/home/lab/code/bert-relation-classification/model.py", line 66, in __init__
    self.apply(self.init_weights)
  File "/home/lab/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 293, in apply
    module.apply(fn)
  File "/home/lab/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 293, in apply
    module.apply(fn)
  File "/home/lab/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 293, in apply
    module.apply(fn)
  File "/home/lab/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 294, in apply
    fn(self)
TypeError: init_weights() takes 1 positional argument but 2 were given
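This TypeError is a version mismatch: in pytorch-transformers 1.1 the model calls self.apply(self.init_weights), where init_weights(self, module) accepts the submodule being visited, while later releases changed init_weights to take no argument. A minimal sketch (toy classes standing in for torch.nn.Module and the model, not the real code) reproduces the mismatch:

```python
# Toy reproduction of the signature mismatch (assumption: simplified
# stand-ins, not the real torch or transformers classes).
class Module:
    def apply(self, fn):
        fn(self)  # torch's Module.apply calls fn(module) on each submodule
        return self

class OldModel(Module):
    def init_weights(self, module):  # pytorch-transformers <= 1.1 style
        pass

class NewModel(Module):
    def init_weights(self):  # later releases dropped the module parameter
        pass

old = OldModel()
old.apply(old.init_weights)      # fine: fn(self) matches (self, module)

new = NewModel()
try:
    new.apply(new.init_weights)  # fn(self) passes one argument too many
except TypeError as err:
    print(err)  # e.g. "init_weights() takes 1 positional argument but 2 were given"
```

Pinning pytorch-transformers to 1.1, as the README requires, or adapting the model's callback to the installed version's signature, avoids the mismatch.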

Question about example_loss_except_other

Hi! I really appreciate your nice work and your sharing the code. 😄
Here is a question that confuses me a little:

dist = one_hot_labels[:, 1:].float() * log_probs[:, 1:]
example_loss_except_other, _ = dist.min(dim=-1)
per_example_loss = - example_loss_except_other.mean()

Why does the min of dist equal the loss excluding the Other class here?
Thanks for your great work again; looking forward to your reply 😆
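One way to see it: after column 0 (the Other class) is sliced off, one_hot_labels zeroes every entry of the log_probs row except the true class, and log-probabilities are never positive, so the row minimum is exactly the true-class log-probability; negating its mean gives the usual cross-entropy over the non-Other classes. A small numeric sketch with assumed toy values (not model outputs):

```python
import math

# Assumed log-probabilities for classes 1..3 (column 0, "Other",
# has already been sliced off, as in the code above)
log_probs = [math.log(0.7), math.log(0.2), math.log(0.1)]
one_hot = [0.0, 1.0, 0.0]  # the true class is the second remaining class

# one_hot zeroes everything except the true class, whose log-prob is <= 0,
# so the row minimum is exactly the true-class log-probability
dist = [o * lp for o, lp in zip(one_hot, log_probs)]
assert min(dist) == math.log(0.2)

per_example_loss = -min(dist)  # the usual negative log-likelihood for that class
```

When the gold label is Other, the sliced one_hot row is all zeros, so dist is all zeros and that example contributes zero loss, which is exactly the "excluding Other" behaviour.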

Where do you implement the Syntactic Indicator?

Hi, I can't find the implementation of the Syntactic Indicator in your code. I notice that in your paper the Contextual Representation part is a concatenation of (H0, He1, He2, z), so the classifier_size should be 4, but it is actually 3 in your code, missing the Syntactic Indicator part. Besides, I also fail to find the W0 b0, W2 b2, We be, Wz bz parts in your code. Would you do me a favour and help me fully understand your implementation?

Shared weight

Hi, could you please explain why W1 and W2 for the two entities are shared? Does this lead to a better result, or is it designed for computational efficiency? Thanks so much!
