najoungkim / pdtb3
Preprocessing code and BERT/XLNet baselines for PDTB 2.0 and 3.0
License: Apache License 2.0
Hi. Do you have any plans to expand this repo/research to explicit discourse relation classification? This is a useful resource, but it is a shame that it is limited to implicit relations, especially considering the other areas of discourse parsing (explicit relation classification, connective classification, etc.) needed to build an end-to-end discourse parser.
Hello!
I followed the requirements (Python 3.7.3, PyTorch 1.1.0, CUDA 9.0.176), but I ran into the following problem:
RuntimeError: cublas runtime error : the GPU program failed to execute at /opt/conda/conda-bld/pytorch_1556653215914/work/aten/src/THC/THCBlas.cu:450
Have you ever encountered this problem?
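For what it's worth, a common cause of this kind of cuBLAS failure is a mismatch between the CUDA version the installed PyTorch wheel was built against and what the driver/GPU actually supports. A minimal diagnostic sketch (not from this repo) to check that:

```python
# Diagnostic sketch, not part of the repo: print the PyTorch/CUDA build
# info and run a tiny GPU matmul, since matmul goes through cuBLAS and
# will reproduce the failure in isolation if the build is incompatible.
import torch

print(torch.__version__)           # PyTorch version, e.g. 1.1.0
print(torch.version.cuda)          # CUDA version the wheel was built with
print(torch.cuda.is_available())

if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
    x = torch.randn(4, 4, device="cuda")
    print((x @ x).shape)           # minimal matmul exercises cuBLAS
```

If the matmul alone crashes, the problem is the environment (e.g. an old CUDA 9 wheel on a newer GPU architecture) rather than this repo's code.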
just want to thank you for sharing the preprocessing scripts :)
I cannot reproduce the experimental result shown in Table 1. In my environment, BERT (base, uncased) achieves 46.7% X-accuracy, which is about 3% lower than your score.
I downloaded PDTB 2.0 in CSV format and preprocessed it using your script.
$ bash pdtb2_setup.sh
$ python preprocess/preprocess_pdtb2.py --data_file data/pdtb2.csv --output_dir data/pdtb2_xval --split xval
row 40600
Cross-validation fold 1
Label counts: {'Expansion.Restatement': 3206, 'Expansion.Conjunction': 3534, 'Contingency.Cause': 4172, 'Comparison.Contrast': 2120, 'Expansion.Instantiation': 1445, 'Temporal.Asynchronous': 697, 'Expansion.Alternative': 185, 'Contingency.Pragmatic cause': 83, 'Comparison.Concession': 223, 'Expansion.List': 400, 'Temporal.Synchrony': 251}
Total: 16316
...
Cross-validation fold 12
Label counts: {'Expansion.Restatement': 3206, 'Expansion.Conjunction': 3534, 'Contingency.Cause': 4172, 'Comparison.Contrast': 2120, 'Expansion.Instantiation': 1445, 'Temporal.Asynchronous': 697, 'Expansion.Alternative': 185, 'Contingency.Pragmatic cause': 83, 'Comparison.Concession': 223, 'Expansion.List': 400, 'Temporal.Synchrony': 251}
Total: 16316
Mean train: 13676.25
Mean dev: 1280.9166666666667
Mean test: 1273.1666666666667
I installed the pytorch-transformers in this repo. Before installing it, I removed line 13 of pytorch-transformers/pytorch_transformers/__init__.py, because that line imports undeclared modules: BertForPDTB and BertForPDTBClassification.
I ran pytorch-transformers/examples/run_pdtb.py on each fold, following the parameters given in the paper.
$ python src/pytorch-transformers/examples/run_pdtb.py \
--data_dir data/pdtb2_xval/fold_1 \
--model_type bert \
--model_name_or_path bert-base-uncased \
--task_name pdtb2_level2 \
--output_dir data/reproduce/pdtb2_level2/bert-base-uncased/fold_1 \
--do_train \
--do_eval \
--evaluate_during_training \
--do_lower_case \
--per_gpu_train_batch_size 8 \
--per_gpu_eval_batch_size 8 \
--gradient_accumulation_steps 1 \
--learning_rate 5e-06 \
--weight_decay 0.0 \
--adam_epsilon 1e-08 \
--max_grad_norm 1.0 \
--num_train_epochs 10.0 \
--max_steps -1 \
--warmup_steps 0 \
--logging_steps 500 \
--save_steps 500 \
--validation_metric acc \
--patience 5 \
--deterministic
07/01/2020 10:49:26 - WARNING - __main__ - Process rank: -1, device: cuda, n_gpu: 1, distributed training: False, 16-bits training: False
...
07/01/2020 11:26:55 - INFO - pytorch_transformers.modeling_utils - loading weights file data/reproduce/pdtb2_level2/bert-base-uncased/fold_2/pytorch_model.bin
07/01/2020 11:27:03 - INFO - __main__ - Loading features from cached file data/pdtb2_xval/fold_2/cached_dev_bert-base-uncased_128_pdtb2_level2
07/01/2020 11:27:03 - INFO - __main__ - ***** Running evaluation *****
07/01/2020 11:27:03 - INFO - __main__ - Num examples = 1133
07/01/2020 11:27:03 - INFO - __main__ - Batch size = 8
Evaluating: 100%|██████████| 142/142 [00:05<00:00, 25.06it/s]
07/01/2020 11:27:09 - INFO - __main__ - ***** Eval results :dev *****
07/01/2020 11:27:09 - INFO - __main__ - acc = 0.46954986760812
07/01/2020 11:27:09 - INFO - __main__ - loss = 1.6523201620914567
07/01/2020 11:27:09 - INFO - __main__ - Loading features from cached file data/pdtb2_xval/fold_2/cached_test_bert-base-uncased_128_pdtb2_level2
07/01/2020 11:27:09 - INFO - __main__ - ***** Running evaluation *****
07/01/2020 11:27:09 - INFO - __main__ - Num examples = 1165
07/01/2020 11:27:09 - INFO - __main__ - Batch size = 8
Evaluating: 100%|██████████| 146/146 [00:05<00:00, 24.81it/s]
07/01/2020 11:27:15 - INFO - __main__ - ***** Eval results :test *****
07/01/2020 11:27:15 - INFO - __main__ - acc = 0.47124463519313303
07/01/2020 11:27:15 - INFO - __main__ - loss = 1.5675792198066842
I then computed X-accuracy by averaging test accuracy over the folds, but it was much lower than your reported score. (The standard deviation shown here was calculated over folds, not over different random seeds.)
Task           Model               X-accuracy
------------   -----------------   -------------
pdtb2_level2   bert-base-uncased   46.72 (±1.31)
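The aggregation described above can be sketched as follows; the per-fold values here are placeholders, not my actual results.

```python
# Sketch of the aggregation: X-accuracy is the mean of per-fold test
# accuracies, and the +/- value is the standard deviation across folds
# (not across random seeds). The numbers below are placeholders.
from statistics import mean, stdev

fold_test_acc = [46.5, 47.1, 45.2, 48.0, 46.9, 47.5,
                 45.8, 46.3, 47.8, 46.0, 45.5, 48.2]  # placeholder values

x_acc = mean(fold_test_acc)
spread = stdev(fold_test_acc)
print(f"{x_acc:.2f} (±{spread:.2f})")
```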
I also tried the latest version of pytorch-transformers, but likewise could not reproduce the result.
Task           Model               X-accuracy
------------   -----------------   -------------
pdtb2_level2   bert-base-uncased   46.39 (±1.11)
               xlnet-base-cased    51.78 (±1.51)
The pdtb3_make_splits_xval function in preprocess/preprocess_pdtb3.py seems to confuse the random-split and cross-validation cases. Could you verify whether this is correct?
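For concreteness, here is a speculative sketch of what a section-rotating cross-validation split might look like for PDTB's 25 WSJ sections. This is an assumption about the intended behavior, not the repo's actual code, and the two-section dev/test blocks are a guess.

```python
# Speculative sketch of section-based 12-fold cross-validation: each
# fold holds out one block of sections for dev and the next block for
# test, rotating through the section list; everything else is training.
def xval_folds(sections, n_folds=12, block=2):
    folds = []
    n = len(sections)
    for i in range(n_folds):
        dev = [sections[(block * i + j) % n] for j in range(block)]
        test = [sections[(block * i + block + j) % n] for j in range(block)]
        train = [s for s in sections if s not in dev and s not in test]
        folds.append((train, dev, test))
    return folds

folds = xval_folds(list(range(25)))  # WSJ sections 00-24
```

A random single split, by contrast, partitions instances once rather than rotating held-out sections, which is why mixing the two code paths would be a bug.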
When trying to run the run_pdtb.py script to train xlnet-large-cased for the PDTB2_LEVEL1 task, a strange RuntimeError occurs.
The script should start the training process and then evaluate at the end.
After downloading the PDTB2.0 corpus in CSV format, I ran your preprocessing script as follows:
$ python preprocess/preprocess_pdtb2.py --data_file data/pdtb2.csv \
--output_dir data/pdtb2_patterson_L1 \
--split single \
--split_name patterson \
--label_type L1
row 40600
Label count: {'Expansion': 8861, 'Comparison': 2503, 'Contingency': 4255, 'Temporal': 950}
After preprocessing, I ran src/pytorch-transformers/examples/run_pdtb.py with the following parameters:
python ../src/pytorch-transformers/examples/run_pdtb.py \
--model_type xlnet \
--task_name PDTB2_LEVEL1 \
--model_name_or_path xlnet-large-cased \
--do_train \
--evaluate_during_training \
--do_eval \
--data_dir ../data/pdtb2_patterson_L1/ \
--max_seq_length 128 \
--per_gpu_eval_batch_size 32 \
--per_gpu_train_batch_size 8 \
--learning_rate 2e-6 \
--num_train_epochs 10.0 \
--output_dir output/output_xlnet_patterson_L1 \
--save_steps 500 \
--logging_steps 500 \
--seed 1 \
--validation_metric acc \
--n_gpu 1 \
--cuda_no 0 \
--deterministic
And the error message is the following:
Traceback (most recent call last):
File "../src/pytorch-transformers/examples/run_pdtb.py", line 592, in <module>
main()
File "../src/pytorch-transformers/examples/run_pdtb.py", line 518, in main
model = model_class.from_pretrained(args.model_name_or_path, from_tf=bool('.ckpt' in args.model_name_or_path), config=config)
File "/home/aq/.local/lib/python3.8/site-packages/pytorch_transformers/modeling_utils.py", line 536, in from_pretrained
model = cls(config, *model_args, **model_kwargs)
File "/home/aq/.local/lib/python3.8/site-packages/pytorch_transformers/modeling_xlnet.py", line 1110, in __init__
self.transformer = XLNetModel(config)
File "/home/aq/.local/lib/python3.8/site-packages/pytorch_transformers/modeling_xlnet.py", line 731, in __init__
self.word_embedding = nn.Embedding(config.n_token, config.d_model)
File "/home/aq/.local/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 137, in __init__
self.weight = Parameter(torch.Tensor(num_embeddings, embedding_dim))
RuntimeError: Trying to create tensor with negative dimension -1: [-1, 1024]
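The last frame shows nn.Embedding being constructed with config.n_token equal to -1 (and d_model = 1024, matching xlnet-large-cased), i.e. the config's vocabulary size was never filled in; why it stays at -1 in the full run is exactly the open question. A minimal, repo-independent reproduction of just that final failure:

```python
# Reproduces only the last step of the traceback above: constructing an
# embedding with a negative vocabulary size. In the full run this means
# the XLNet config's n_token was still a -1 placeholder when the model
# was built; the upstream cause is not confirmed here.
import torch.nn as nn

try:
    nn.Embedding(-1, 1024)
except RuntimeError as err:
    print(err)  # "Trying to create tensor with negative dimension ..."
```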