TAB (Tuning with Auxiliary Branch)

Code and pretrained models for ACL 2023 main conference short paper: "Modality Adaption or Regularization? A Case Study on End-to-End Speech Translation".

Environment Configuration

conda create -n tab python=3.6
conda activate tab
conda install pytorch==1.8.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge
# cudatoolkit=10.2 is also acceptable.

mkdir ~/TAB && cd ~/TAB && mkdir data checkpoints
git clone https://github.com/hannlp/tab Fairseq-S2T
cd Fairseq-S2T
pip install --upgrade pip
pip install --editable ./
pip install sacrebleu==1.4.13 pandas sacremoses sentencepiece configargparse gpustat tensorboard editdistance

Examples on MuST-C en-de

cd ~/TAB/Fairseq-S2T/egs/mustc/st

Data Preparation

ln -s path/to/mustc_root/ende ~/TAB/data/mustc.en-de

bash run.sh --tgt_lang de
# It may take a bit of time.

Once you have pre-processed the data, the directory should look like this:

ls ~/TAB/data/mustc.en-de/st
config_share.yaml  dev.tsv      spm_unigram10000_st_share.model  spm_unigram10000_st_share.vocab  tst-COMMON.en
dev.en             fbank80.zip  spm_unigram10000_st_share.txt    train.tsv                        tst-COMMON.tsv

Tuning with auxiliary branch

Download the pre-trained ASR and MT models and modify the parameters in conf/tab.yaml such as load-pretrained-xxx-from to match the corresponding models.
Some other parameters in the paper can be adjusted, including: use-auxiliary-branch,consistency-type,consistency-weight,replacement-probability-strategy,replacement-probability,uncertainty-gamma.
Run the following command to start training (Note that gpu_num * max_tokens * update_freq should be 320k. And exp_tag can be set to mark and distinguish different experiments).

bash run.sh --stage 1 --stop_stage 1 --tgt_lang de --train_config tab --exp_tag uncertainty_gamma0.5_alpha5 --gpu_num 4 --max_tokens 20000 --update_freq 4

The trained models will be saved at ~/TAB/checkpoints/mustc.en-de/st/{DATE}_tab_{EXP_TAG}

Evaluate

bash run.sh --stage 2 --stop_stage 2 --tgt_lang de --exp_name {DATE}_tab_{EXP_TAG} --gpu_num 1 --test_subset dev,tst-COMMON

Contact and Citation

To keep things simple, we have not included the scripts for the pre-training stage. If you have any questions, please do not hesitate to contact me at [email protected]. If you find this repository helpful, please cite it as:

@inproceedings{han-etal-2023-modality,
    title = "Modality Adaption or Regularization? A Case Study on End-to-End Speech Translation",
    author = "Han, Yuchen  and
      Xu, Chen  and
      Xiao, Tong  and
      Zhu, Jingbo",
    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.acl-short.115",
    pages = "1340--1348",
    abstract = "Pre-training and fine-tuning is a paradigm for alleviating the data scarcity problem in end-to-end speech translation (E2E ST). The commonplace {''}modality gap{''} between speech and text data often leads to inconsistent inputs between pre-training and fine-tuning. However, we observe that this gap occurs in the early stages of fine-tuning, but does not have a major impact on the final performance. On the other hand, we find that there has another gap, which we call the {''}capacity gap{''}: high resource tasks (such as ASR and MT) always require a large model to fit, when the model is reused for a low resource task (E2E ST), it will get a sub-optimal performance due to the over-fitting. In a case study, we find that the regularization plays a more important role than the well-designed modality adaption method, which achieves 29.0 for en-de and 40.3 for en-fr on the MuST-C dataset.",
}

hannlp / tab Goto Github PK

tab's Introduction

TAB (Tuning with Auxiliary Branch)

Environment Configuration

Examples on MuST-C en-de

Data Preparation

Tuning with auxiliary branch

Evaluate

Contact and Citation

tab's People

Contributors

Stargazers

Watchers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent