najoungkim / pdtb3
Preprocessing code and BERT/XLNet baselines for PDTB 2.0 and 3.0
License: Apache License 2.0
Hi. Do you have any plans to expand this repo/research to explicit discourse relation classification? This is a useful resource, but it is a shame that it is limited to implicit relations, especially considering the other areas of discourse parsing (explicit relation classification, connective classification, etc.) needed to build an end-to-end discourse parser.
Hello!
I followed the requirements (Python 3.7.3, PyTorch 1.1.0, CUDA 9.0.176), but I ran into the following problem:
RuntimeError: cublas runtime error : the GPU program failed to execute at /opt/conda/conda-bld/pytorch_1556653215914/work/aten/src/THC/THCBlas.cu:450
Have you ever encountered this problem?
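For what it's worth, a common cause of this kind of cuBLAS failure is a mismatch between the CUDA version the installed PyTorch wheel was built against and what the driver/GPU actually supports. A minimal diagnostic sketch (not from this repo) to check that:

```python
# Diagnostic sketch, not part of the repo: print the PyTorch/CUDA build
# info and run a tiny GPU matmul, since matmul goes through cuBLAS and
# will reproduce the failure in isolation if the build is incompatible.
import torch

print(torch.__version__)           # PyTorch version, e.g. 1.1.0
print(torch.version.cuda)          # CUDA version the wheel was built with
print(torch.cuda.is_available())

if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
    x = torch.randn(4, 4, device="cuda")
    print((x @ x).shape)           # minimal matmul exercises cuBLAS
```

If the matmul alone crashes, the problem is the environment (e.g. an old CUDA 9 wheel on a newer GPU architecture) rather than this repo's code.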
just want to thank you for sharing the preprocessing scripts :)
I cannot reproduce the experimental result shown in Table 1. In my environment, BERT (base, uncased) achieves 46.7% X-accuracy, which is about 3% lower than your score.
I downloaded PDTB 2.0 in CSV format and preprocessed it using your script.
$ bash pdtb2_setup.sh
$ python preprocess/preprocess_pdtb2.py --data_file data/pdtb2.csv --output_dir data/pdtb2_xval --split xval
row 40600
Cross-validation fold 1
Label counts: {'Expansion.Restatement': 3206, 'Expansion.Conjunction': 3534, 'Contingency.Cause': 4172, 'Comparison.Contrast': 2120, 'Expansion.Instantiation': 1445, 'Temporal.Asynchronous': 697, 'Expansion.Alternative': 185, 'Contingency.Pragmatic cause': 83, 'Comparison.Concession': 223, 'Expansion.List': 400, 'Temporal.Synchrony': 251}
Total: 16316
...
Cross-validation fold 12
Label counts: {'Expansion.Restatement': 3206, 'Expansion.Conjunction': 3534, 'Contingency.Cause': 4172, 'Comparison.Contrast': 2120, 'Expansion.Instantiation': 1445, 'Temporal.Asynchronous': 697, 'Expansion.Alternative': 185, 'Contingency.Pragmatic cause': 83, 'Comparison.Concession': 223, 'Expansion.List': 400, 'Temporal.Synchrony': 251}
Total: 16316
Mean train: 13676.25
Mean dev: 1280.9166666666667
Mean test: 1273.1666666666667
I installed the pytorch-transformers in this repo. Before installing it, I removed line 13 of pytorch-transformers/pytorch_transformers/__init__.py, because that line imports undeclared modules: BertForPDTB and BertForPDTBClassification.
I ran pytorch-transformers/examples/run_pdtb.py on each fold, following the parameters given in the paper.
$ python src/pytorch-transformers/examples/run_pdtb.py \
--data_dir data/pdtb2_xval/fold_1 \
--model_type bert \
--model_name_or_path bert-base-uncased \
--task_name pdtb2_level2 \
--output_dir data/reproduce/pdtb2_level2/bert-base-uncased/fold_1 \
--do_train \
--do_eval \
--evaluate_during_training \
--do_lower_case \
--per_gpu_train_batch_size 8 \
--per_gpu_eval_batch_size 8 \
--gradient_accumulation_steps 1 \
--learning_rate 5e-06 \
--weight_decay 0.0 \
--adam_epsilon 1e-08 \
--max_grad_norm 1.0 \
--num_train_epochs 10.0 \
--max_steps -1 \
--warmup_steps 0 \
--logging_steps 500 \
--save_steps 500 \
--validation_metric acc \
--patience 5 \
--deterministic
07/01/2020 10:49:26 - WARNING - __main__ - Process rank: -1, device: cuda, n_gpu: 1, distributed training: False, 16-bits training: False
...
07/01/2020 11:26:55 - INFO - pytorch_transformers.modeling_utils - loading weights file data/reproduce/pdtb2_level2/bert-base-uncased/fold_2/pytorch_model.bin
07/01/2020 11:27:03 - INFO - __main__ - Loading features from cached file data/pdtb2_xval/fold_2/cached_dev_bert-base-uncased_128_pdtb2_level2
07/01/2020 11:27:03 - INFO - __main__ - ***** Running evaluation *****
07/01/2020 11:27:03 - INFO - __main__ - Num examples = 1133
07/01/2020 11:27:03 - INFO - __main__ - Batch size = 8
Evaluating: 100%|██████████| 142/142 [00:05<00:00, 25.06it/s]
07/01/2020 11:27:09 - INFO - __main__ - ***** Eval results :dev *****
07/01/2020 11:27:09 - INFO - __main__ - acc = 0.46954986760812
07/01/2020 11:27:09 - INFO - __main__ - loss = 1.6523201620914567
07/01/2020 11:27:09 - INFO - __main__ - Loading features from cached file data/pdtb2_xval/fold_2/cached_test_bert-base-uncased_128_pdtb2_level2
07/01/2020 11:27:09 - INFO - __main__ - ***** Running evaluation *****
07/01/2020 11:27:09 - INFO - __main__ - Num examples = 1165
07/01/2020 11:27:09 - INFO - __main__ - Batch size = 8
Evaluating: 100%|██████████| 146/146 [00:05<00:00, 24.81it/s]
07/01/2020 11:27:15 - INFO - __main__ - ***** Eval results :test *****
07/01/2020 11:27:15 - INFO - __main__ - acc = 0.47124463519313303
07/01/2020 11:27:15 - INFO - __main__ - loss = 1.5675792198066842
I then computed X-accuracy by averaging test accuracy over the folds, but it was much lower than your reported score. (The standard deviation shown here was calculated over folds, not over different random seeds.)
Task           Model               X-accuracy
------------   -----------------   -------------
pdtb2_level2   bert-base-uncased   46.72 (±1.31)
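The aggregation described above can be sketched as follows; the per-fold values here are placeholders, not my actual results.

```python
# Sketch of the aggregation: X-accuracy is the mean of per-fold test
# accuracies, and the +/- value is the standard deviation across folds
# (not across random seeds). The numbers below are placeholders.
from statistics import mean, stdev

fold_test_acc = [46.5, 47.1, 45.2, 48.0, 46.9, 47.5,
                 45.8, 46.3, 47.8, 46.0, 45.5, 48.2]  # placeholder values

x_acc = mean(fold_test_acc)
spread = stdev(fold_test_acc)
print(f"{x_acc:.2f} (±{spread:.2f})")
```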
I also tried the latest version of pytorch-transformers, but likewise could not reproduce the result.
Task           Model               X-accuracy
------------   -----------------   -------------
pdtb2_level2   bert-base-uncased   46.39 (±1.11)
               xlnet-base-cased    51.78 (±1.51)
The pdtb3_make_splits_xval function in preprocess/preprocess_pdtb3.py seems to confuse the random-split and cross-validation cases. Could you verify whether this is correct?
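For concreteness, here is a speculative sketch of what a section-rotating cross-validation split might look like for PDTB's 25 WSJ sections. This is an assumption about the intended behavior, not the repo's actual code, and the two-section dev/test blocks are a guess.

```python
# Speculative sketch of section-based 12-fold cross-validation: each
# fold holds out one block of sections for dev and the next block for
# test, rotating through the section list; everything else is training.
def xval_folds(sections, n_folds=12, block=2):
    folds = []
    n = len(sections)
    for i in range(n_folds):
        dev = [sections[(block * i + j) % n] for j in range(block)]
        test = [sections[(block * i + block + j) % n] for j in range(block)]
        train = [s for s in sections if s not in dev and s not in test]
        folds.append((train, dev, test))
    return folds

folds = xval_folds(list(range(25)))  # WSJ sections 00-24
```

A random single split, by contrast, partitions instances once rather than rotating held-out sections, which is why mixing the two code paths would be a bug.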
When trying to run the run_pdtb.py script to train xlnet-large-cased for the PDTB2_LEVEL1 task, a strange RuntimeError occurs.
The script should start the training process and then evaluate at the end.
After downloading the PDTB2.0 corpus in CSV format, I ran your preprocessing script as follows:
$ python preprocess/preprocess_pdtb2.py --data_file data/pdtb2.csv \
--output_dir data/pdtb2_patterson_L1 \
--split single \
--split_name patterson \
--label_type L1
row 40600
Label count: {'Expansion': 8861, 'Comparison': 2503, 'Contingency': 4255, 'Temporal': 950}
After preprocessing, I ran src/pytorch-transformers/examples/run_pdtb.py with the following parameters:
python ../src/pytorch-transformers/examples/run_pdtb.py \
--model_type xlnet \
--task_name PDTB2_LEVEL1 \
--model_name_or_path xlnet-large-cased \
--do_train \
--evaluate_during_training \
--do_eval \
--data_dir ../data/pdtb2_patterson_L1/ \
--max_seq_length 128 \
--per_gpu_eval_batch_size 32 \
--per_gpu_train_batch_size 8 \
--learning_rate 2e-6 \
--num_train_epochs 10.0 \
--output_dir output/output_xlnet_patterson_L1 \
--save_steps 500 \
--logging_steps 500 \
--seed 1 \
--validation_metric acc \
--n_gpu 1 \
--cuda_no 0 \
--deterministic
And the error message is the following:
Traceback (most recent call last):
File "../src/pytorch-transformers/examples/run_pdtb.py", line 592, in <module>
main()
File "../src/pytorch-transformers/examples/run_pdtb.py", line 518, in main
model = model_class.from_pretrained(args.model_name_or_path, from_tf=bool('.ckpt' in args.model_name_or_path), config=config)
File "/home/aq/.local/lib/python3.8/site-packages/pytorch_transformers/modeling_utils.py", line 536, in from_pretrained
model = cls(config, *model_args, **model_kwargs)
File "/home/aq/.local/lib/python3.8/site-packages/pytorch_transformers/modeling_xlnet.py", line 1110, in __init__
self.transformer = XLNetModel(config)
File "/home/aq/.local/lib/python3.8/site-packages/pytorch_transformers/modeling_xlnet.py", line 731, in __init__
self.word_embedding = nn.Embedding(config.n_token, config.d_model)
File "/home/aq/.local/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 137, in __init__
self.weight = Parameter(torch.Tensor(num_embeddings, embedding_dim))
RuntimeError: Trying to create tensor with negative dimension -1: [-1, 1024]
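The last frame shows nn.Embedding being constructed with config.n_token equal to -1 (and d_model = 1024, matching xlnet-large-cased), i.e. the config's vocabulary size was never filled in; why it stays at -1 in the full run is exactly the open question. A minimal, repo-independent reproduction of just that final failure:

```python
# Reproduces only the last step of the traceback above: constructing an
# embedding with a negative vocabulary size. In the full run this means
# the XLNet config's n_token was still a -1 placeholder when the model
# was built; the upstream cause is not confirmed here.
import torch.nn as nn

try:
    nn.Embedding(-1, 1024)
except RuntimeError as err:
    print(err)  # "Trying to create tensor with negative dimension ..."
```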