CL, Mar 31, 2020

Assessing Human Translations from French to Bambara for Machine Learning: a Pilot Study

Michael Leventhal, Allahsera Tapo, Sarah Luger, Marcos Zampieri, Christopher M. Homan

arXiv:2004.00068v1, 0.52 citations

Originality: Synthesis-oriented

AI Analysis

This paper addresses the challenge of creating high-quality training data for machine translation in low-resource language settings. Its contribution is incremental, because it focuses on methods for assessing translations rather than on new translation models.

The study evaluates human-translated aligned texts intended for training machine translation models in under-resourced languages such as Bambara. It finds that written and oral translations can reach similar quality for certain kinds of texts, and it identifies specific instructions that would help translators improve their work.

We present novel methods for assessing the quality of human-translated aligned texts for learning machine translation models of under-resourced languages. Malian university students translated French texts, producing either written or oral translations to Bambara. Our results suggest that similar quality can be obtained from either written or spoken translations for certain kinds of texts. They also suggest specific instructions that human translators should be given in order to improve the quality of their work.
