Translating from Morphologically Complex Languages: A Paraphrase-Based Approach
This work addresses a key bottleneck in machine translation from languages with complex derivational morphology: it treats morphologically related words as potential paraphrases and reports significant gains on Malay-to-English translation.
The paper tackles translation from morphologically complex languages by using a paraphrase-based approach to handle derivational morphology, showing significant improvements over rival methods in experiments translating from Malay to English across five automatic evaluation measures.
We propose a novel approach to translating from a morphologically complex language. Unlike previous research, which has targeted word inflections and concatenations, we focus on the pairwise relationship between morphologically related words, which we treat as potential paraphrases and handle using paraphrasing techniques at the word, phrase, and sentence level. An important advantage of this framework is that it can cope with derivational morphology, which has so far remained largely beyond the capabilities of statistical machine translation systems. Our experiments translating from Malay, whose morphology is mostly derivational, into English show significant improvements over rival approaches according to five automatic evaluation measures (for 320,000 sentence pairs; 9.5 million English word tokens).