A Focus on Neural Machine Translation for African Languages
This work addresses the scarcity of research and resources for African language machine translation, although its contribution is incremental: it applies existing techniques to new data.
The authors tackle machine translation for low-resourced African languages by training neural models to translate English into five South African languages, reporting promising results and releasing reproducible data and code.
African languages are numerous, complex and low-resourced. The datasets required for machine translation are difficult to discover, and existing research is hard to reproduce. Minimal attention has been given to machine translation for African languages, so there is scant research on the problems that arise when machine translation techniques are applied to them. To begin addressing these problems, we trained models to translate English into five of the official South African languages (Afrikaans, isiZulu, Northern Sotho, Setswana, Xitsonga), making use of modern neural machine translation techniques. The results show the promise of neural machine translation techniques for African languages. By providing reproducible, publicly available data, code and results, this research aims to give other researchers in African machine translation a starting point to compare against and build upon.