CL
Aug 31, 2018

Cognate-aware morphological segmentation for multilingual neural translation

arXiv:1808.10791v1
1092 citations
Originality: Incremental advance
AI Analysis

This work addresses inconsistent morphological segmentation when translating into lower-resource languages such as Estonian. It is an incremental advance because it builds on existing methods, namely Morfessor and the Transformer.

The authors tackle inconsistent morphological segmentation of similar words in multilingual neural translation. They introduce Cognate Morfessor, a multilingual variant of Morfessor, and show that it improves translation quality, especially for Estonian, which has fewer training resources.

This article describes the Aalto University entry to the WMT18 News Translation Shared Task. We participate in the multilingual subtrack with a system trained under the constrained condition to translate from English to both Finnish and Estonian. The system is based on the Transformer model. We focus on improving the consistency of morphological segmentation for words that are similar orthographically, semantically, and distributionally; such words include etymological cognates, loan words, and proper names. For this, we introduce Cognate Morfessor, a multilingual variant of the Morfessor method. We show that our approach improves the translation quality particularly for Estonian, which has fewer resources for training the translation model.
