QE-fusion

Don't Rank, Combine! Combining Machine Translation Hypotheses Using Quality Estimation
Giorgos Vernikos, Andrei Popescu-Belis

Overview

QE-fusion⚛ is a novel approach that uses Quality Estimation (QE) metrics to improve translations by fusing different translation candidates. QE-fusion leverages the potential complementarity of a candidate pool by spotting variations among candidates and using a QE metric to select the spans that boost overall translation quality.

This repository contains code for sampling translations from an LLM or an MT model, implementing QE-fusion, and other reranking approaches, and scoring the corresponding outputs.

Installation

This project requires Python 3.10, PyTorch 1.13.1, and transformers 4.34.0.

It's advisable to set up a separate environment for this project and install the necessary dependencies:

conda create -n qe-fusion python=3.10
conda activate qe-fusion
pip install -r requirements.txt

Obtain Translation Hypotheses

The script assumes that the test set of each language pair (can be downloaded from sacreBLEU) is located in the folder data/<s>-<t>/ under the names test.<s> and test.<t> where <s> and <t> are the source and target languages.

The first step is to obtain multiple translation hypotheses for each input. To do so, we either use an LLM via few-shot (8-shot) learning or an MT model from which we sample k outputs using the llm_query.py script.

python llm_query.py --model llama2-7b --lp en-de --bsize 4 --decoding_alg sample --sample 5 --temperature 0.6 --exemplars 8

This script also supports greedy decoding and beam search via the decoding_alg parameter.

Fusion of Hypotheses

To fuse the generated hypotheses use the select_outputs.py script:

python select_outputs.py --model llama2-7b --lp en-de --generation sample-t0.6 --cands_pool 5 --criterion cometqe --method fusion

To implement reranking approaches such as QE-reranking and MBR modify the --criterion and --method parameters accordingly.

Evaluation

To evaluate the quality of the produced translations use the score_outs.py script:

python score_outs.py --model llama2-7b --lp en-de --cands_pool 5 --generation sample-t0.6 --criterion cometqe/cometqe-fusion-beam5-kbest0  --metrics ['bleu', 'chrf', 'comet', 'bleurt']

Reference

Please feel free to cite our paper if you use our code or proposed algorithm.:

@misc{vernikos2024dont,
      title={Don't Rank, Combine! Combining Machine Translation Hypotheses Using Quality Estimation}, 
      author={Giorgos Vernikos and Andrei Popescu-Belis},
      year={2024},
      eprint={2401.06688},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Contact

Please feel free to raise an issue or contact me in case you require any help setting up the repo!

georgevern / qe-fusion Goto Github PK

qe-fusion's Introduction

QE-fusion

Overview

Installation

Obtain Translation Hypotheses

Fusion of Hypotheses

Evaluation

Reference

Contact

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent