Improved English to Russian Translation by Neural Suffix Prediction
This work addresses translation quality for morphologically rich languages such as Russian, though the contribution is incremental because it builds on existing NMT architectures.
The paper tackles the performance deficiency of neural machine translation on morphologically rich languages by predicting stems and suffixes separately during decoding, which yields an improvement of up to 1.98 BLEU on English-to-Russian translation.
Neural machine translation (NMT) suffers a performance deficiency when a limited vocabulary fails to cover the source or target side adequately, which happens frequently with morphologically rich languages. To address this problem, previous work focused on adjusting translation granularity or expanding the vocabulary size. However, morphological information, which may further improve translation quality, is relatively under-considered in NMT architectures. We propose a novel method that both reduces data sparsity and models morphology through a simple but effective mechanism. By predicting the stem and suffix separately during decoding, our system achieves an improvement of up to 1.98 BLEU compared with previous work on English-to-Russian translation. Our method is orthogonal to different NMT architectures and yields consistent improvements across various domains.
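To make the idea of predicting the stem and suffix separately more concrete, the following is a minimal sketch of how target-side words might be split into parallel stem and suffix sequences, and merged back after decoding. It is an illustration under stated assumptions, not the paper's implementation: the suffix list `TOY_SUFFIXES`, the `NO_SUFFIX` placeholder, and the helper names `split_word`, `to_stem_suffix_sequences`, and `merge` are all hypothetical, and a real system would likely rely on a proper stemmer or morphological analyzer rather than a hand-picked list.

```python
from typing import List, Sequence, Tuple

# Hypothetical, hand-picked suffix list for illustration only (not from the paper).
TOY_SUFFIXES = ("ами", "ого", "ая", "ой", "ы", "и", "а", "у", "е")
# Hypothetical placeholder emitted when a word has no recognised suffix.
NO_SUFFIX = "<none>"


def split_word(word: str,
               suffixes: Sequence[str] = TOY_SUFFIXES,
               min_stem_len: int = 2) -> Tuple[str, str]:
    """Split a word into (stem, suffix), preferring the longest matching suffix."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        # Only split if the remaining stem is long enough to be meaningful.
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[:-len(suffix)], suffix
    return word, NO_SUFFIX


def to_stem_suffix_sequences(tokens: List[str]) -> Tuple[List[str], List[str]]:
    """Turn a target sentence into two aligned sequences: stems and suffixes."""
    pairs = [split_word(t) for t in tokens]
    return [stem for stem, _ in pairs], [suffix for _, suffix in pairs]


def merge(stems: List[str], suffixes: List[str]) -> List[str]:
    """Recombine predicted stems and suffixes into surface word forms."""
    return [s if x == NO_SUFFIX else s + x for s, x in zip(stems, suffixes)]


tokens = ["красная", "книга", "книги", "книгу", "на"]
stems, suffixes = to_stem_suffix_sequences(tokens)
# Three inflected forms of the same noun now share a single stem entry,
# which is the sense in which the split reduces data sparsity.
print(len(set(tokens)), "word types ->", len(set(stems)), "stem types")
print(merge(stems, suffixes) == tokens)
```

In this sketch the stem vocabulary is smaller than the word vocabulary because inflected forms collapse onto one stem, while the suffix sequence carries the morphological information as a separate, small prediction target.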