CL · Nov 12, 2019

Character-based NMT with Transformer

Rohit Gupta, Laurent Besacier, Marc Dymetman, Matthias Gallé

arXiv:1911.04997v12.424 citations

Originality: Incremental advance

AI Analysis

This work addresses the robustness of NMT to noisy and out-of-domain text. The contribution is incremental, since it builds on known techniques.

The paper addresses the performance gap between character-based neural machine translation (NMT) and BPE-based models by applying the Transformer architecture to character-level input and output. It shows that character-based models are more robust to noisy text and domain shift, and that deeper models reach comparable BLEU scores on clean, in-domain data.

Character-based translation has several appealing advantages, but its performance is in general worse than a carefully tuned BPE baseline. In this paper we study the impact of character-based input and output with the Transformer architecture. In particular, our experiments on EN-DE show that character-based Transformer models are more robust than their BPE counterparts, both when translating noisy text and when translating text from a different domain. To obtain comparable BLEU scores on clean, in-domain data and close the gap with BPE-based models, we use known techniques to train deeper Transformer models.
