CL · Apr 19, 2019

Code-Switching for Enhancing NMT with Pre-Specified Translation

Kai Song, Yue Zhang, Heng Yu, Weihua Luo, Kun Wang, Min Zhang

arXiv:1904.09107v431.51120 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses a practical need: letting users specify translations in NMT systems. The advance is incremental, since it builds on data augmentation techniques.

The paper tackles the problem of incorporating user-provided translations into neural machine translation (NMT) without harming overall translation quality, achieving consistent improvements over existing methods by enhancing translation of constrained words while preserving unconstrained words.

Leveraging user-provided translation to constrain NMT has practical significance. Existing methods can be classified into two main categories, namely the use of placeholder tags for lexicon words and the use of hard constraints during decoding. Both methods can hurt translation fidelity for various reasons. We investigate a data augmentation method, making code-switched training data by replacing source phrases with their target translations. Our method does not change the NMT model or decoding algorithm, allowing the model to learn lexicon translations by copying source-side target words. Extensive experiments show that our method achieves consistent improvements over existing approaches, improving translation of constrained words without hurting unconstrained words.
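The augmentation step described in the abstract, replacing source phrases with their target translations to produce code-switched training input, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `code_switch`, the `lexicon` format, the `replace_prob` parameter, and the greedy longest-match strategy are all assumptions made for this example. The abstract does not say how phrase pairs are chosen or how often they are replaced.

```python
import random
from typing import Dict, List, Optional, Tuple


def code_switch(
    source: List[str],
    lexicon: Dict[Tuple[str, ...], List[str]],
    replace_prob: float = 1.0,
    max_phrase_len: int = 3,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Return a copy of `source` with lexicon phrases swapped for target words.

    Hypothetical helper: `lexicon` maps a source phrase (tuple of tokens) to
    its target-language tokens. Matching is greedy, longest phrase first.
    """
    rng = rng or random.Random()
    out: List[str] = []
    i = 0
    while i < len(source):
        replaced = False
        # Try the longest candidate phrase starting at position i first.
        for n in range(min(max_phrase_len, len(source) - i), 0, -1):
            phrase = tuple(source[i:i + n])
            if phrase in lexicon and rng.random() < replace_prob:
                out.extend(lexicon[phrase])  # insert target-side words
                i += n
                replaced = True
                break
        if not replaced:
            out.append(source[i])  # keep the original source token
            i += 1
    return out


# Toy German-to-English example; the lexicon entries are made up.
toy_lexicon = {("hund",): ["dog"], ("guten", "morgen"): ["good", "morning"]}
switched = code_switch("guten morgen , mein hund".split(), toy_lexicon)
```

In this sketch only the source side is modified. The target sentence of each training pair stays unchanged, so the model can learn to copy the embedded target words into its output. At inference time, the same substitution would be applied to the user's pre-specified translations.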
