Universal Vector Neural Machine Translation With Effective Attention
This work addresses the efficiency and scalability of multi-language translation systems, a practical concern for NLP practitioners. The contribution appears incremental, since it builds on existing encoder-decoder and attention frameworks.
Neural machine translation typically requires a separate model for each language pair. The authors propose a universal model that handles multiple languages, and they introduce an attention mechanism that adds an overall learning vector to multiplicative attention to improve performance. They report that this approach reduces the number of models required for multi-language applications.
Neural Machine Translation (NMT) uses one or more trained neural networks to translate phrases. Sutskever et al. introduced a sequence-to-sequence encoder-decoder model, which became the standard for NMT-based systems. Attention mechanisms were later introduced to address problems with translating long sentences and to improve overall accuracy. In this paper, we propose a single model for Neural Machine Translation based on encoder-decoder models. Most translation models are trained as one model per translation direction. First, we introduce a neutral/universal model representation that can be used to predict more than one language, depending on the source and a provided target. Second, we introduce an attention model that adds an overall learning vector to the multiplicative model. With these two changes, the number of models needed for multi-language translation applications is reduced.
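
The abstract does not give the exact form of the attention model, so the following is only a minimal sketch of one possible reading. It assumes the standard multiplicative score between a decoder state and each encoder state, and it assumes that the "overall learning vector" is a single learned vector `v`, shared across all decoder steps, that contributes an extra additive term to each score. The function names, the weight matrix `W`, the vector `v`, and the way `v` enters the score are all hypothetical and are not taken from the paper.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [dot(row, x) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(decoder_state, encoder_states, W, v):
    """Multiplicative attention with an extra learned vector (assumed form).

    score_s = decoder_state^T W h_s  +  v^T h_s
    The first term is the usual multiplicative score. The second term is
    an ASSUMPTION about how an "overall learning vector" might be added.
    """
    scores = []
    for h_s in encoder_states:
        multiplicative = dot(decoder_state, matvec(W, h_s))
        overall = dot(v, h_s)  # hypothetical extra term, not from the paper
        scores.append(multiplicative + overall)
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # Context vector: attention-weighted sum of the encoder states.
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context
```

With `v` set to all zeros, this sketch reduces to plain multiplicative attention. A nonzero `v` shifts the weights toward encoder states that align with `v`, independently of the current decoder state.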