R-Net

Requirements

Many known problems are caused by mismatched software versions. Please check your versions before opening an issue or emailing me.

General

  • Python >= 3.4
  • unzip, wget

Python Packages

  • tensorflow-gpu >= 1.5.0
  • spaCy >= 2.0.0
  • tqdm
  • ujson

Usage

To download and preprocess the data, run

# download SQuAD and GloVe
sh download.sh
# preprocess the data
python config.py --mode prepro

Hyperparameters are stored in config.py. To debug, train, or test the model, run config.py with --mode set to debug, train, or test:

python config.py --mode debug/train/test

To get the official score, run

python evaluate-v1.1.py ~/data/squad/dev-v1.1.json log/answer/answer.json

The default directory for TensorBoard log files is log/event.

Detailed Implementation

  • The original paper uses additive attention, which consumes a lot of memory. This project instead adopts the scaled multiplicative attention presented in Attention Is All You Need.
  • This project adopts the variational dropout presented in A Theoretically Grounded Application of Dropout in Recurrent Neural Networks.
  • To address the degradation problem in stacked RNNs, the outputs of each layer are concatenated to produce the final output.
  • When the loss on the dev set increases over a certain period, the learning rate is halved.
  • During prediction, the project adopts the search method presented in Machine Comprehension Using Match-LSTM and Answer Pointer.
  • To improve efficiency, this implementation uses a bucketing method (contributed by xiongyifan) and CudnnGRU. Bucketing speeds up training but lowers the F1 score by 0.3%.
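
Three of the points above can be illustrated with a short, framework-free sketch: scaled multiplicative attention, the constrained answer-span search, and halving the learning rate when the dev loss stops improving. This is not the project's TensorFlow code. The function names, the max_len=15 span limit, the patience=3 window, and the starting learning rate are all assumptions chosen for illustration.

```python
import math


def scaled_dot_attention(queries, keys, values):
    """For each query, return softmax(q . k / sqrt(d)) applied to the values."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Dot-product score against every key, scaled by sqrt of the key size.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        # Numerically stable softmax over the scores.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs


def best_span(start_probs, end_probs, max_len=15):
    """Return (start, end) maximizing start_probs[i] * end_probs[j],
    subject to i <= j < i + max_len (max_len is an assumed limit)."""
    best, best_score = (0, 0), -1.0
    for i, p_start in enumerate(start_probs):
        for j in range(i, min(i + max_len, len(end_probs))):
            score = p_start * end_probs[j]
            if score > best_score:
                best, best_score = (i, j), score
    return best


def schedule_lr(dev_losses, lr=0.5, patience=3):
    """Replay a sequence of dev losses and halve lr after `patience`
    consecutive evaluations without improvement (assumed rule)."""
    best, bad = float("inf"), 0
    for loss in dev_losses:
        if loss < best:
            best, bad = loss, 0
        else:
            bad += 1
        if bad >= patience:
            # Halve the rate and restart the patience window from this loss.
            lr, best, bad = lr / 2.0, loss, 0
    return lr
```

In best_span, picking the start and end positions independently could yield an end before the start; searching over valid pairs avoids that. For example, with start_probs = [0.1, 0.7, 0.2] and end_probs = [0.6, 0.1, 0.3], the independent maxima are start 1 and end 0, while the constrained search selects the span from position 1 to position 2.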

Performance

Score

                  EM      F1
original paper    71.1    79.5
this project      71.07   79.51
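
For readers unfamiliar with the two columns, the following is a minimal sketch of how exact match (EM) and token-level F1 are typically computed for a single prediction under the SQuAD v1.1 convention. It is an illustration, not a replacement for evaluate-v1.1.py; the function names are chosen here, and the official script additionally takes the maximum over all reference answers and averages over questions.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, strip punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """True when the normalized strings are identical."""
    return normalize_answer(prediction) == normalize_answer(truth)


def f1_score(prediction, truth):
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(truth).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    num_same = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)
```

A prediction of "in Paris France" against the reference "Paris" is not an exact match, but it shares one of three predicted tokens and the single reference token, giving precision 1/3, recall 1, and F1 0.5.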

Training Time (s/it)

            Native   Native + Bucket   Cudnn   Cudnn + Bucket
E5-2640     6.21     3.56              -       -
TITAN X     2.56     1.31              0.41    0.28

Extensions

These settings may increase the score but are not enabled in the model by default. You can turn them on in config.py.

Contributors

  • xiongyifan
  • zhaodaolimeng
