Vanishing Gradient Analysis

Zero

A neural machine translation system implemented in Python 2 and TensorFlow.

Features

Multi-Process Data Loading/Processing (has known problems)
Multi-GPU Training/Decoding
Gradient Aggregation
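
As a rough illustration of what gradient aggregation means, the sketch below accumulates gradients over several micro-batches and applies a single update, which matches one update on the combined batch. It is plain Python on a toy one-parameter regression, not the toolkit's TensorFlow code; the names grad_sum and aggregated_step are hypothetical and chosen only for this example.

```python
# Illustrative sketch only: plain Python, not the toolkit's TensorFlow code.
# All function names here are hypothetical.

def grad_sum(w, batch):
    """Summed gradient of the squared error 0.5 * (w*x - y)^2 over a batch."""
    return sum((w * x - y) * x for x, y in batch)


def aggregated_step(w, micro_batches, lr):
    """Accumulate gradients over several micro-batches, then update once."""
    total_grad = 0.0
    total_examples = 0
    for batch in micro_batches:
        total_grad += grad_sum(w, batch)
        total_examples += len(batch)
    # One update with the averaged gradient, as if it were a single large batch.
    return w - lr * total_grad / total_examples


# Toy data generated by y = 2 * x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
micro_batches = [data[:2], data[2:]]

w_aggregated = aggregated_step(0.0, micro_batches, lr=0.1)
w_full_batch = aggregated_step(0.0, [data], lr=0.1)
```

Here w_aggregated and w_full_batch agree, which is the point of aggregation: a large effective batch can be simulated when only small batches fit in memory.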

Papers

Each paper below is associated with a readme file link. Please click the link of the paper you are interested in for more details.

Efficient CTC Regularization via Coarse Labels for End-to-End Speech Translation, EACL2023
Revisiting End-to-End Speech-to-Text Translation From Scratch, ICML2022
Sparse Attention with Linear Units, EMNLP2021
Edinburgh's End-to-End Multilingual Speech Translation System for IWSLT 2021, IWSLT2021 System submission
Beyond Sentence-Level End-to-End Speech Translation: Context Helps, ACL2021
On Sparsifying Encoder Outputs in Sequence-to-Sequence Models, ACL2021 Findings
Share or Not? Learning to Schedule Language-Specific Capacity for Multilingual Translation, ICLR2021
Fast Interleaved Bidirectional Sequence Generation, WMT2020
Adaptive Feature Selection for End-to-End Speech Translation, EMNLP2020 Findings
Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation, ACL2020
Improving Deep Transformer with Depth-Scaled Initialization and Merged Attention, EMNLP2019

Supported Models

RNNSearch: supports LSTM, GRU, SRU, ATR (EMNLP2018), and LRN (ACL2019) models.
Deep attention: Neural Machine Translation with Deep Attention, TPAMI
CAEncoder: the context-aware recurrent encoder; see the paper (TASLP) and the original source code (in Theano).
Transformer: Attention Is All You Need
AAN: the average attention model (ACL2018), which accelerates decoding.
Fixup: Fixup Initialization: Residual Learning Without Normalization
Relative position representation: Self-Attention with Relative Position Representations

Requirements

python2.7
tensorflow <= 1.13.2
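
The version bound above can be checked before running anything. The following is a minimal sketch, assuming purely numeric dotted version strings; parse_version and is_supported_tensorflow are hypothetical helpers and are not part of this toolkit.

```python
# Hypothetical helpers for checking the "tensorflow <= 1.13.2" requirement.

def parse_version(text):
    """Turn '1.13.2' into (1, 13, 2); assumes purely numeric dotted versions."""
    return tuple(int(part) for part in text.split("."))


def is_supported_tensorflow(version, maximum="1.13.2"):
    """Return True if `version` is no newer than `maximum`."""
    return parse_version(version) <= parse_version(maximum)
```

Comparing integer tuples rather than raw strings matters because, as strings, "1.9.0" would sort after "1.13.2". In practice the string to check could come from tensorflow.__version__; pre-release suffixes such as "rc1" are not handled by this sketch.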

Usage

How to use this toolkit for machine translation?

TODO:

organize the parameters and their interpretations in config
reformat and complete code comments
simplify and remove unnecessary code
improve rnn models

Citation

If you use the source code, please consider citing the following paper:

@InProceedings{D18-1459,
  author = 	"Zhang, Biao
		and Xiong, Deyi
		and Su, Jinsong
		and Lin, Qian
		and Zhang, Huiji",
  title = 	"Simplifying Neural Machine Translation with Addition-Subtraction Twin-Gated Recurrent Networks",
  booktitle = 	"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",
  year = 	"2018",
  publisher = 	"Association for Computational Linguistics",
  pages = 	"4273--4283",
  location = 	"Brussels, Belgium",
  url = 	"http://aclweb.org/anthology/D18-1459"
}

If you are interested in the CAEncoder model, please consider citing our TASLP paper:

@article{Zhang:2017:CRE:3180104.3180106,
 author = {Zhang, Biao and Xiong, Deyi and Su, Jinsong and Duan, Hong},
 title = {A Context-Aware Recurrent Encoder for Neural Machine Translation},
 journal = {IEEE/ACM Trans. Audio, Speech and Lang. Proc.},
 issue_date = {December 2017},
 volume = {25},
 number = {12},
 month = dec,
 year = {2017},
 issn = {2329-9290},
 pages = {2424--2432},
 numpages = {9},
 url = {https://doi.org/10.1109/TASLP.2017.2751420},
 doi = {10.1109/TASLP.2017.2751420},
 acmid = {3180106},
 publisher = {IEEE Press},
 address = {Piscataway, NJ, USA},
}

Reference

When developing this repository, I referred to the following projects:

Contact

For any questions or suggestions, please feel free to contact Biao Zhang
