
maps-mt's Introduction


🗺️ MAPS: Multi-Aspect Prompting and Selection

Implementation of our paper:

Exploring Human-Like Translation Strategy with Large Language Models

🔥 Update

  • [March 19, 2024]: Accepted to TACL 2024!
  • [June 14, 2023]: This work now has a Demo. Try it!
  • [June 10, 2023]: interactive.py now runs MAPS-mt in interactive mode.
  • [June 10, 2023]: We now support translation between any pair of these languages: English, Chinese, Japanese, French, and German.

MAPS

Motivation

Figure: The difference between machine and human translation in an English-Chinese example. Typical neural machine translation is a source-target mapping process, while human translators can take complex steps to ensure the quality and accuracy of the translation.



Framework

MAPS aims to enable LLMs to mimic the human translation process by multi-aspect prompting and selection.
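
The "selection" half of that description can be illustrated with a minimal sketch. It assumes each prompting aspect yields one candidate translation and that some scoring function ranks them. The aspect names, the candidate strings, and the `select_best` helper are all hypothetical, and the length-based scorer is only a stand-in. The Dependencies section downloads a COMET QE checkpoint, which suggests a reference-free quality estimator fills this role, but that is an assumption here rather than a description of the repository's code.

```python
from typing import Callable, Dict


def select_best(candidates: Dict[str, str], score: Callable[[str], float]) -> str:
    """Return the name of the candidate whose translation scores highest."""
    return max(candidates, key=lambda name: score(candidates[name]))


# Hypothetical candidates: one plain baseline plus one per prompting aspect.
candidates = {
    "baseline": "translation A",
    "keywords": "translation B, longer",
    "topics": "translation C",
}

# Stand-in scorer (string length) used only to make the sketch runnable;
# a real setup would plug in a translation quality estimator here.
best = select_best(candidates, score=len)
print(best, "->", candidates[best])  # keywords -> translation B, longer
```

Keeping the scorer as a parameter means the selection logic stays the same whichever quality metric is used.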


Dependencies

  • Download COMET and BLEURT checkpoints:

    wget https://unbabel-experimental-models.s3.amazonaws.com/comet/wmt21/wmt21-comet-qe-da.tar.gz
    tar -xf wmt21-comet-qe-da.tar.gz -C eval_ckpt/
    
    wget https://storage.googleapis.com/bleurt-oss-21/BLEURT-20.zip
    unzip -d eval_ckpt/ BLEURT-20.zip
  • Create the conda environment:

    conda create -n maps -c pytorch -c nvidia python==3.8.13 krb5 git pytorch==2.0.0 pytorch-cuda=11.7
  • Install the other Python packages:

    pip3 install -r requirements.txt
    

Reproduce the main results

Run

Preparation

  • Set your OpenAI API_KEY in model/openai/translate.py
  • Set the Alpaca checkpoint file path in run-maps-alpaca.sh and run-translation-alpaca.sh
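
If you would rather not hard-code the key in a source file, one option is to read it from the environment. The sketch below is an assumption about how that could be done, not the repository's actual code: the `load_api_key` helper and the `OPENAI_API_KEY` variable name are illustrative, and how `API_KEY` is consumed inside model/openai/translate.py is not shown in this README.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in env_var, or raise if it is unset or blank."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

With this helper in place, the assignment in translate.py could become `API_KEY = load_api_key()`, which fails early with a clear message when the key is missing.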

Run MAPS

  • text-davinci-003: sh run-maps.sh
  • Alpaca: sh run-maps-alpaca.sh

Run other methods

  • text-davinci-003: sh run-translation-003.sh
  • Alpaca: sh run-translation-alpaca.sh

Note: The translation results have already been generated and saved in the output directory. Therefore, the scripts won't repeat the inference. If you want to regenerate the results, simply delete the contents within the output directory.
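
The skip-when-present behaviour described in the note can be pictured with the sketch below. It is an illustration under assumptions: the `run_if_missing` helper and the file name are hypothetical, and the scripts' real check (for example, whether they test for file existence or for a complete line count) is not documented here.

```python
import tempfile
from pathlib import Path
from typing import Callable


def run_if_missing(output_path: Path, generate: Callable[[], str]) -> bool:
    """Write generate() to output_path unless a non-empty file already exists.

    Returns True when generation ran, False when the cached file was reused.
    """
    if output_path.exists() and output_path.stat().st_size > 0:
        return False
    output_path.parent.mkdir(parents=True, exist_ok=True)
    output_path.write_text(generate(), encoding="utf-8")
    return True


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "output" / "example.trans"  # hypothetical file name
    first = run_if_missing(out, lambda: "translated text\n")   # generates
    second = run_if_missing(out, lambda: "translated text\n")  # reuses the file
    out.unlink()  # deleting the saved result forces regeneration
    third = run_if_missing(out, lambda: "translated text\n")

print(first, second, third)  # True False True
```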


Evaluation

sh run-evaluation.sh > evaluation.log


Interactive

To try MAPS without a GPU or CUDA, use the interactive script as shown below (currently text-davinci-003 only):

# Preparation
wget https://unbabel-experimental-models.s3.amazonaws.com/comet/wmt21/wmt21-comet-qe-da.tar.gz
tar -xf wmt21-comet-qe-da.tar.gz -C eval_ckpt/   
conda create -n maps -c pytorch python==3.8.13 pytorch==2.0.0  
conda activate maps
pip3 install -r requirements.txt
# Interactive
(maps) zwhe@zhiweideMacBook-Pro MAPS-mt % python3 interactive.py --lang-pair en-zh

Enter source English sentence: Joint Aid for Dogs is a high specification joint and muscle supplement with glucosamine for dogs, designed to aid freedom of movement.

Output:

[Figure: interactive session output]

Remember to set your OpenAI API_KEY in model/openai/translate.py. You can also take a look at the demo website.

Citation

@article{he2023exploring,
    author = {He, Zhiwei and Liang, Tian and Jiao, Wenxiang and Zhang, Zhuosheng and Yang, Yujiu and Wang, Rui and Tu, Zhaopeng and Shi, Shuming and Wang, Xing},
    title = "{Exploring Human-Like Translation Strategy with Large Language
                    Models}",
    journal = {Transactions of the Association for Computational Linguistics},
    volume = {12},
    pages = {229-246},
    year = {2024},
    month = {03},
    issn = {2307-387X},
    doi = {10.1162/tacl_a_00642},
    url = {https://doi.org/10.1162/tacl\_a\_00642},
    eprint = {https://direct.mit.edu/tacl/article-pdf/doi/10.1162/tacl\_a\_00642/2346100/tacl\_a\_00642.pdf},
}

maps-mt's People

Contributors

deyangkong, skytliang, zwhe99

maps-mt's Issues

Trailing comma after `TOPICS`

I noticed that there was a trailing comma after TOPICS in format_ask_topic.py. This will make TOPICS a single-item tuple consisting of a five-item list. As a result, the prompts for asking topics will be a (wrong) 1-shot example with mismatched topic keywords, instead of 5-shot.
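
The effect is easy to reproduce in isolation. In the sketch below the topic strings and source sentences are placeholders (the real values live in format_ask_topic.py), and pairing examples with `zip` is an assumption about how the few-shot prompt is assembled. The tuple behaviour itself is standard Python.

```python
# Placeholder topic strings; the real values live in format_ask_topic.py.
TOPICS_WITH_COMMA = ["Health", "Business", "Sports", "Travel", "Science"],
TOPICS_FIXED = ["Health", "Business", "Sports", "Travel", "Science"]
SOURCES = ["s1", "s2", "s3", "s4", "s5"]  # placeholder example sentences

# The trailing comma wraps the list in a one-element tuple.
print(type(TOPICS_WITH_COMMA).__name__, len(TOPICS_WITH_COMMA))  # tuple 1
print(type(TOPICS_FIXED).__name__, len(TOPICS_FIXED))            # list 5

# zip stops at the shortest input, so the buggy version yields a single
# "shot" whose topic is the entire five-item list.
shots_buggy = list(zip(SOURCES, TOPICS_WITH_COMMA))
shots_fixed = list(zip(SOURCES, TOPICS_FIXED))
print(len(shots_buggy), len(shots_fixed))  # 1 5
print(shots_buggy[0][1] == TOPICS_FIXED)   # True
```

Removing the trailing comma restores the five-element list.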

A quick question about BLEURT and Tools

First of all, congratulations on your work being accepted by TACL!
I have some questions:

  1. Implementation of the BLEURT metric

I directly downloaded the evaluation model from the official BLEURT repository and installed the corresponding packages. Using the example code, I evaluated a translation, as follows:

    from bleurt import score
    references_list = read('wmt22.en-zh.zh')
    candidates_list = read('wmt22.en-zh.zh.maps.0-seed.trans')
    checkpoint = "bleurt/bleurt/test_checkpoint"
    scorer = score.BleurtScorer(checkpoint)
    scores = scorer.score(references=references_list, candidates=candidates_list)

    average_score = sum(scores) / len(scores)
    print("Average BLEURT score:", average_score)

However, the score was only 0.57. Is this evaluation process consistent with the one in your paper? Could there be something I've overlooked that resulted in this poor score?
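
The snippet in this issue calls a `read` helper that is not shown. A minimal sketch of what it might look like follows, assuming one sentence per line in UTF-8 files; the helper body and the demo file are illustrative, not taken from the issue. Separately, the snippet loads `bleurt/bleurt/test_checkpoint`, while the Dependencies section downloads BLEURT-20 into eval_ckpt/. Whether that difference explains the score gap is not confirmed in this thread.

```python
import os
import tempfile
from typing import List


def read(path: str) -> List[str]:
    """Read a file with one sentence per line, dropping the line endings."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


def mean(scores: List[float]) -> float:
    """Average a non-empty list of segment-level scores."""
    if not scores:
        raise ValueError("no scores to average")
    return sum(scores) / len(scores)


# Small self-contained demo with a temporary file (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.zh")
    with open(demo_path, "w", encoding="utf-8") as handle:
        handle.write("第一句。\n第二句。\n")
    lines = read(demo_path)

print(lines)             # ['第一句。', '第二句。']
print(mean([0.5, 1.0]))  # 0.75
```

Before scoring, it is worth checking that the reference and candidate lists have the same length, since a scorer pairs them position by position.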

  2. Graphical Tools

Additionally, I am very curious about which tools you used to create the charts in your paper.

Illegal hardware instruction?

I have a problem with interactive.py. When I run it as instructed, I get the error: "zsh: illegal hardware instruction python3 interactive.py"

After checking, I found that the error comes from the line "from comet import load_from_checkpoint, download_model", but the problem persists after reinstalling the comet library.

Please help. Thanks in advance.

Error in interactive.py

I am using interactive.py and have completed the preparation steps as instructed, but I get the following error:
ImportError: cannot import name 'demo_ex_dict' from 'data' (unknown location)

I don't know what this data package is, and I can't find it in the repo. Please help.

Thanks in advance.
