A Character-Level Decoder without Explicit Segmentation for Neural Machine Translation
This work addresses a fundamental limitation of machine translation systems, their reliance on explicit segmentation, by enabling character-level generation, which could improve the handling of morphologically rich languages. It is, however, an incremental advance over existing neural methods.
The paper proposes a character-level decoder without explicit segmentation for neural machine translation. Models with this decoder outperform those with a subword-level decoder on four language pairs, and ensembles of them outperform state-of-the-art non-neural systems on three of the pairs while performing comparably on the fourth.
Existing machine translation systems, whether phrase-based or neural, have relied almost exclusively on word-level modelling with explicit segmentation. In this paper, we ask a fundamental question: can neural machine translation generate a character sequence without any explicit segmentation? To answer this question, we evaluate an attention-based encoder-decoder with a subword-level encoder and a character-level decoder on four language pairs (En-Cs, En-De, En-Ru and En-Fi) using the parallel corpora from WMT'15. Our experiments show that the models with a character-level decoder outperform the ones with a subword-level decoder on all four language pairs. Furthermore, ensembles of neural models with a character-level decoder outperform the state-of-the-art non-neural machine translation systems on En-Cs, En-De and En-Fi and perform comparably on En-Ru.
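
To make "a character sequence without any explicit segmentation" concrete, the following minimal sketch shows one way a target sentence could be represented for a character-level decoder: every character, including the space, is an ordinary output symbol, so no word or subword boundaries are supplied. The function names, the special symbols (`<pad>`, `<bos>`, `<eos>`) and the example sentences are assumptions made for illustration. They are not taken from the paper, and the sketch covers only the data representation, not the attention-based model itself.

```python
# Hypothetical illustration: character-level target representation.
# Nothing here is the paper's code; names and special symbols are assumed.

SPECIALS = ["<pad>", "<bos>", "<eos>"]


def build_char_vocab(sentences):
    """Map every character seen in the target sentences to an integer id."""
    chars = sorted({ch for s in sentences for ch in s})
    return {sym: i for i, sym in enumerate(SPECIALS + chars)}


def encode_chars(sentence, vocab):
    """Turn a sentence into ids, one per character; spaces are kept as symbols."""
    return [vocab["<bos>"]] + [vocab[ch] for ch in sentence] + [vocab["<eos>"]]


def decode_chars(ids, vocab):
    """Invert encode_chars, dropping the special symbols."""
    inverse = {i: sym for sym, i in vocab.items()}
    return "".join(inverse[i] for i in ids if inverse[i] not in SPECIALS)


targets = ["Guten Morgen", "Guten Abend"]
vocab = build_char_vocab(targets)
ids = encode_chars("Guten Morgen", vocab)
restored = decode_chars(ids, vocab)
```

Because the space is just another entry in the vocabulary, a decoder trained on such sequences has to learn where words begin and end on its own, which is the setting the question above refers to.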