CL · Jun 19, 2024

How effective is Multi-source pivoting for Translation of Low Resource Indian Languages?

arXiv:2406.13332v1 · 3 citations
Originality: Incremental advance
AI Analysis

The paper addresses translation challenges for low-resource Indian languages, but its improvements are incremental.

This paper tackles machine translation from English to low-resource Indian languages with a multi-source pivoting approach that uses both the source and pivot sentences. It finds that the approach yields only marginal improvements over state-of-the-art methods, though these gains can be enhanced with synthetic target-language data.

Machine Translation (MT) between linguistically dissimilar languages is challenging, especially due to the scarcity of parallel corpora. Prior works suggest that pivoting through a high-resource language can help translation into a related low-resource language. However, existing works tend to discard the source sentence when pivoting. Taking the case of English to Indian language MT, this paper explores the 'multi-source translation' approach with pivoting, using both source and pivot sentences to improve translation. We conducted extensive experiments with various multi-source techniques for translating English to Konkani, Manipuri, Sanskrit, and Bodo, using Hindi, Marathi, and Bengali as pivot languages. We find that multi-source pivoting yields marginal improvements over the state-of-the-art, contrary to previous claims, but these improvements can be enhanced with synthetic target language data. We believe multi-source pivoting is a promising direction for low-resource translation.

