Modeling Homophone Noise for Robust Neural Machine Translation
This work improves the robustness of Neural Machine Translation for users dealing with homophone errors, particularly in languages like Chinese.
This paper addresses homophone noise in Neural Machine Translation (NMT) by proposing a framework that combines a homophone noise detector with a syllable-aware NMT model. Experiments on Chinese->English translation show that the method significantly outperforms baselines on noisy test sets and also improves substantially on clean text.
In this paper, we propose a robust neural machine translation (NMT) framework. The framework consists of a homophone noise detector and a syllable-aware NMT model that is robust to homophone errors. The detector identifies potential homophone errors in a textual sentence and converts them into syllables, forming a mixed sequence of characters and syllables that is then fed into the syllable-aware NMT model. Extensive experiments on Chinese->English translation demonstrate that our proposed method not only significantly outperforms baselines on noisy test sets with homophone noise, but also achieves a substantial improvement on clean text.
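The conversion step described above, in which flagged characters are replaced by their syllables to form a mixed sequence, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the toy lexicon `TOY_PINYIN`, the function `to_mixed_sequence`, and the hand-set `flagged` positions, which stand in for the output of the detector (the abstract does not specify how the detector works).

```python
from typing import Dict, List, Set

# Hypothetical toy character-to-syllable (pinyin, tones dropped) table.
# A real system would need a full pronunciation lexicon.
TOY_PINYIN: Dict[str, str] = {
    "我": "wo",
    "在": "zai",
    "再": "zai",  # homophone of "在"
    "家": "jia",
}


def to_mixed_sequence(
    chars: List[str], flagged: Set[int], lexicon: Dict[str, str]
) -> List[str]:
    """Replace characters at flagged positions with their syllables.

    Unflagged characters, and flagged characters missing from the
    lexicon, are passed through unchanged.
    """
    mixed = []
    for i, ch in enumerate(chars):
        if i in flagged and ch in lexicon:
            mixed.append(lexicon[ch])
        else:
            mixed.append(ch)
    return mixed


# "我再家" contains "再" where the homophone "在" was intended ("我在家").
noisy = list("我再家")
flagged = {1}  # assumed detector output: position 1 is a suspected error
mixed = to_mixed_sequence(noisy, flagged, TOY_PINYIN)
print(mixed)  # ['我', 'zai', '家']
```

Because the suspected character is replaced by its syllable, the intended sentence and the noisy one map to the same mixed sequence, which is the kind of input a syllable-aware translation model would then have to handle.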