Code for Multilingual Reply Suggestion (MRS)

This repository contains the code for our ACL 2021 paper:

Mozhi Zhang, Wei Wang, Budhaditya Deb, Guoqing Zheng, Milad Shokouhi, Ahmed Hassan Awadallah. A Dataset and Baselines for Multilingual Reply Suggestion.

If you find the repository useful, please cite:

@inproceedings{zhang-2021-mrs,
    title = {A Dataset and Baselines for Multilingual Reply Suggestion},
    author = {Mozhi Zhang and Wei Wang and Budhaditya Deb and Guoqing Zheng and Milad Shokouhi and Ahmed Hassan Awadallah},
    booktitle = {Proceedings of the Association for Computational Linguistics},
    doi = "10.18653/v1/2021.acl-long.97",
    year = {2021}
}

The code has three parts: an evaluation script (eval.py), retrieval model training (retrieval_rs), and generation model training (generation_rs).

Evaluate Models

Run the following to install dependencies:

pip install -r requirements.txt
python -c 'import nltk; nltk.download("punkt")'

The evaluation script eval.py takes one required argument, the prediction file:

python eval.py PRED_FILE

PRED_FILE is a TSV file. Each line is an example with the following columns:

Message <tab> Reference Reply <tab> Predicted Reply 1 <tab> Predicted Reply 2 <tab> Predicted Reply 3
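As a minimal sketch of producing and checking a file in this five-column layout, the helpers below write one example per line and verify the column count when reading it back. The function names, the whitespace cleanup, and the strict five-column check are illustrative assumptions and are not part of eval.py.

```python
def write_pred_file(path, examples):
    """Write (message, reference, predictions) triples as a 5-column TSV.

    Hypothetical helper: each example must carry exactly three predictions.
    """
    with open(path, "w", encoding="utf-8", newline="") as f:
        for message, reference, predictions in examples:
            if len(predictions) != 3:
                raise ValueError("expected exactly 3 predicted replies")
            fields = [message, reference, *predictions]
            # Collapse every whitespace run (including tabs and newlines)
            # into one space so a field cannot break the column layout.
            cleaned = [" ".join(field.split()) for field in fields]
            f.write("\t".join(cleaned) + "\n")


def read_pred_file(path):
    """Read the TSV back and check that every line has 5 columns."""
    rows = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            columns = line.rstrip("\n").split("\t")
            if len(columns) != 5:
                raise ValueError(
                    f"line {line_no}: expected 5 columns, got {len(columns)}"
                )
            rows.append(columns)
    return rows
```

Calling write_pred_file("pred.tsv", examples) and then running python eval.py pred.tsv would be one way to use it. Here pred.tsv is a placeholder file name.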

For Japanese, add --ja to use the Japanese tokenizer.

Train Retrieval Models

  1. Install dependencies: pip install -r retrieval_rs/requirements.txt
  2. (Optional) Install Apex
  3. Download the multilingual BERT model from Hugging Face
  4. Use retrieval_rs/train.sh to train the model. Set the paths in the script first.
  5. Use retrieval_rs/test.sh to generate predictions and evaluate the model. Set the paths in the script first.

Train Generation Models

  1. Install Unicoder for generation
  2. Download the Unicoder-xDAE model
  3. Use generation_rs/preprocess.sh to preprocess the data. Set the paths in the script first.
  4. Use generation_rs/train.sh to train the model. Set the paths in the script first.
  5. Use generation_rs/test.sh to generate predictions and evaluate the model. Set the paths in the script first.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.
