CL · Oct 3, 2016

Orthographic Syllable as basic unit for SMT between Related Languages

arXiv:1610.00634v1 · 35 citations
Originality: Incremental advance
AI Analysis

This paper addresses translation efficiency for low-resource related languages, though it is an incremental improvement over existing unit-based methods.

The paper tackles machine translation between related languages that use abugida or alphabetic scripts by taking the orthographic syllable as the basic unit, and shows that this approach significantly outperforms word-, morpheme-, and character-based models when trained on small parallel corpora.

We explore the use of the orthographic syllable, a variable-length consonant-vowel sequence, as a basic unit of translation between related languages which use abugida or alphabetic scripts. We show that orthographic syllable level translation significantly outperforms models trained over other basic units (word, morpheme and character) when training over small parallel corpora.
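To make the idea of a variable-length consonant-vowel unit concrete, here is a minimal sketch of segmenting a word into such units. It is an illustration under stated assumptions, not the authors' segmenter: it handles lowercase Latin text only, uses a toy vowel inventory, treats a unit as an optional consonant run followed by a vowel run, and emits any word-final consonant run as its own unit. The function name `orthographic_syllables` is hypothetical.

```python
import re

# Assumption: toy vowel inventory for lowercase Latin-script text.
VOWELS = "aeiou"

# A unit is an optional run of consonants followed by a run of vowels.
# A consonant run with no vowel after it (word-final) becomes its own unit.
# [^aeiou\W\d_] matches any letter that is not one of the listed vowels.
_UNIT = re.compile(rf"[^{VOWELS}\W\d_]*[{VOWELS}]+|[^{VOWELS}\W\d_]+")


def orthographic_syllables(word):
    """Split a word into consonant-vowel units (simplified illustration)."""
    return _UNIT.findall(word.lower())


if __name__ == "__main__":
    for w in ["translation", "india", "script"]:
        print(w, orthographic_syllables(w))
```

The units sit between characters and words in length, so a translation model trained on them sees shorter sequences than a character-level model while keeping a far smaller vocabulary than a word-level model. An abugida script would need script-specific consonant, vowel-sign, and conjunct rules in place of this toy vowel set.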

