CL · Mar 31, 2021

An Exploration of Data Augmentation Techniques for Improving English to Tigrinya Translation

arXiv:2103.16789v2 · 9 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses translation challenges for Tigrinya speakers, but its contribution is incremental: it applies known techniques to a specific low-resource language pair.

The paper tackles low-resource neural machine translation from English to Tigrinya by exploring back-translation methods. It finds that pivoting through a higher-resource language related to Tigrinya yields substantial improvements over baselines.

It has been shown that the performance of neural machine translation (NMT) drops starkly in low-resource conditions, often requiring large amounts of auxiliary data to achieve competitive results. An effective method of generating auxiliary data is back-translation of target language sentences. In this work, we present a case study of Tigrinya where we investigate several back-translation methods to generate synthetic source sentences. We find that in low-resource conditions, back-translation by pivoting through a higher-resource language related to the target language proves most effective, resulting in substantial improvements over baselines.
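To make the two data-generation routes in the abstract concrete, the following is a minimal sketch of direct and pivot-based back-translation. It is an illustration under stated assumptions, not the authors' pipeline: the function names (`back_translate`, `pivot_back_translate`, `augment`) and the toy translators are hypothetical, and a real setup would plug trained NMT models in place of the string-tagging stand-ins.

```python
from typing import Callable, Iterable, List, Tuple

# A translator is any function mapping one sentence to another.
# In practice this would wrap a trained NMT model (assumption).
Translator = Callable[[str], str]
Pair = Tuple[str, str]  # (source sentence, target sentence)


def back_translate(targets: Iterable[str], target_to_source: Translator) -> List[Pair]:
    """Direct back-translation: pair each genuine target sentence
    with a synthetic source sentence produced by a reverse model."""
    return [(target_to_source(t), t) for t in targets]


def pivot_back_translate(
    targets: Iterable[str],
    target_to_pivot: Translator,
    pivot_to_source: Translator,
) -> List[Pair]:
    """Pivot back-translation: go target -> pivot -> source, where the
    pivot is a higher-resource language related to the target."""
    return [(pivot_to_source(target_to_pivot(t)), t) for t in targets]


def augment(parallel: Iterable[Pair], synthetic: Iterable[Pair]) -> List[Pair]:
    """Combine genuine parallel data with synthetic pairs for training."""
    return list(parallel) + list(synthetic)


# Hypothetical stand-ins that only tag the input, so the data flow is visible.
def toy_target_to_pivot(sentence: str) -> str:
    return f"pivot({sentence})"


def toy_pivot_to_source(sentence: str) -> str:
    return f"source({sentence})"


monolingual_target = ["t1", "t2"]  # placeholder target-language sentences
synthetic = pivot_back_translate(
    monolingual_target, toy_target_to_pivot, toy_pivot_to_source
)
training = augment([("s0", "t0")], synthetic)
print(training)
```

In both routes the target side of every synthetic pair is a genuine sentence; only the source side is machine-generated, which is why back-translation can help even when the reverse model is imperfect.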
