qizhex / race_ar_baselines

Baselines of the RACE Reading Comprehension Dataset


race_ar_baselines's Introduction

RACE Reading Comprehension Task

Code for the paper: RACE: Large-scale ReAding Comprehension Dataset From Examinations. Guokun Lai*, Qizhe Xie*, Hanxiao Liu, Yiming Yang and Eduard Hovy. EMNLP 2017

Leaderboard of RACE

Datasets

RACE: Please submit a data request here; the data will be sent to you automatically. Create a "data" directory alongside the "src" directory and download the data into it.
Word embeddings:
- glove.6B.zip: http://nlp.stanford.edu/data/glove.6B.zip
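
Before running the scripts, it can help to confirm that the files ended up where the code looks for them. The following is a minimal sketch, not part of the repository: the helper name missing_paths is hypothetical, and the subdirectories it checks (data/data/train, data/data/test, data/embedding) are an assumption inferred from the paths that appear in the issue reports further down, not from the instructions above.

```python
from pathlib import Path

# Assumed layout, inferred from the issue reports in this page
# (../data/data/train, ../data/data/test, data/embedding).
# Adjust the list if your checkout expects something different.
EXPECTED = ["data/data/train", "data/data/test", "data/embedding"]


def missing_paths(repo_root, expected=EXPECTED):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in expected if not (root / p).exists()]


if __name__ == "__main__":
    # Run from the repository root (the directory that contains "src").
    for p in missing_paths("."):
        print("missing:", p)
```

Running it from the repository root prints one line per missing path and nothing if the assumed layout is complete.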

Usage

Preprocessing

* python preprocess.py

Stanford AR

* test pre-trained model: bash test_SAR.sh
* train: bash train_SAR.sh (The pre-trained model will be replaced)

GA

* test pre-trained model: bash test_GA.sh
* train: bash train_GA.sh (The pre-trained model will be replaced)

Reference

@inproceedings{lai2017large,
  title={RACE: Large-scale ReAding Comprehension Dataset From Examinations},
  author={Lai, Guokun and Xie, Qizhe and Liu, Hanxiao and Yang, Yiming and Hovy, Eduard},
  booktitle={EMNLP},
  year={2017}
}

Acknowledgement

The code is adapted from Stanford AR https://github.com/danqi/rc-cnn-dailymail and GA https://github.com/bdhingra/ga-reader

Contact

Please contact Qizhe Xie (qzxie AT cs DOT cmu DOT edu) if you find bugs or missing info

race_ar_baselines's Issues

test pretrained model failed

 [root@localhost src]# bash test_GA.sh 
!!!test
WARNING (theano.configdefaults): g++ not detected ! Theano will be unable to execute optimized C-implementations (for both CPU and GPU) and will default to Python implementations. Performance will be severely degraded. To remove this warning, set Theano flags cxx to an empty string.
WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu1 is not available  (error: cuda unavailable)
/root/.pyenv/versions/yan35/lib/python3.5/site-packages/theano/tensor/signal/downsample.py:6: UserWarning: downsample module has been moved to the theano.tensor.signal.pool module.
  "downsample module has been moved to the theano.tensor.signal.pool module.")
06-27 16:35 main.py -train_file ../data/data/train -dev_file ../data/data/test -embedding_size 100 -pre_trained ../obj/model_GA.pkl.gz -test_only True -model GA -num_GA_layers 1 -hidden_size 128
06-27 16:35 --------------------------------------------------
06-27 16:35 Load data files..
06-27 16:35 ********** Train
06-27 16:35 #Examples: 87866
06-27 16:35 ********** Dev
06-27 16:35 #Examples: 4934
06-27 16:35 --------------------------------------------------
06-27 16:35 Build dictionary..
06-27 16:35 --------------------------------------------------
06-27 16:35 Embeddings: 50002 x 100
06-27 16:35 Compile functions..
Traceback (most recent call last):
  File "main.py", line 309, in <module>
    main(args)
  File "main.py", line 216, in main
    train_fn, test_fn, params, all_params = build_fn(args, embeddings)
  File "main.py", line 82, in build_fn
    rnn_layer=args.rnn_layer)
  File "/root/yanqiang/12-RACE_AR_baselines/src/nn_layers.py", line 87, in stack_rnn
    network = _rnn(True, name)
  File "/root/yanqiang/12-RACE_AR_baselines/src/nn_layers.py", line 84, in _rnn
    name=name + '_layer' + str(layer + 1))
  File "/root/.pyenv/versions/yan35/lib/python3.5/site-packages/lasagne/layers/recurrent.py", line 1180, in __init__
    super(GRULayer, self).__init__(incomings, **kwargs)
TypeError: __init__() got an unexpected keyword argument 'only_return_final'

I tried to run test_GA.sh, but it failed. Can you tell me what caused this?
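
A possible explanation, not confirmed by the maintainers: the traceback shows GRULayer forwarding only_return_final to its parent class through **kwargs, which is what one would expect if the installed Lasagne does not declare that parameter itself (for example, an older release). The sketch below shows one way to check this. The helper has_explicit_param and the two stand-in classes are hypothetical illustrations, not Lasagne code.

```python
import inspect


def has_explicit_param(cls, name):
    """True if cls.__init__ declares `name` as a named parameter (not only **kwargs)."""
    return name in inspect.signature(cls.__init__).parameters


# Hypothetical stand-ins that mimic the two situations; they are not Lasagne classes.
class Base:
    def __init__(self, incoming, name=None):
        self.name = name


class OldStyleLayer(Base):
    def __init__(self, incoming, num_units, **kwargs):
        # Unknown keyword arguments are passed on and rejected by Base.
        super().__init__(incoming, **kwargs)


class NewStyleLayer(Base):
    def __init__(self, incoming, num_units, only_return_final=False, **kwargs):
        super().__init__(incoming, **kwargs)
        self.only_return_final = only_return_final


print(has_explicit_param(OldStyleLayer, "only_return_final"))  # False
print(has_explicit_param(NewStyleLayer, "only_return_final"))  # True
```

With Lasagne installed, the same check can be applied to the real class: has_explicit_param(lasagne.layers.GRULayer, "only_return_final"). If it returns False, trying a newer Lasagne version would be a reasonable next step.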

I tried to use the Stanford Attentive Reader on the data, testing the pre-trained model via bash test_SAR.sh. I downloaded both the GloVe embeddings, which I put under data/embedding, and the RACE data, which I put under data/data. However, the resulting accuracy on both the development and test data is only around 0.36-0.39, not above 0.4 as stated in the paper. There is no error message as far as I can tell; the output of the script is attached. Any idea what could have gone wrong?

Looking forward to an answer!

log.txt

The leaderboard cannot be accessed

Hello, the leaderboard cannot be accessed; the page returns a 404 error (the address is 'http://www.qizhexie.com//data/RACE_leaderboard').

About the AR model

Hi
I rebuilt the AR model in TensorFlow, but the loss does not decrease. I can't find the difference between your model and mine. Could you publish the training log?
Thank you very much!
