AfriScience-MT: Towards Decolonizing Science in Africa through Text Translation
This work provides a resource and benchmark for scientific machine translation in African languages, addressing a critical gap in language technology for underrepresented languages.
The authors built a parallel corpus for six African languages across 11 scientific domains, with translators creating new terms where no established scientific terminology existed. They benchmarked MT systems and large language models and found that GPT-5.4 and Gemini-3.1-Flash-Lite achieved the highest average sentence-level COMET scores (68.3 and 68.0, respectively), while fine-tuned NLLB-1.3B led among open-source models (67.3 at the sentence level).
The dominance of colonial languages in African education and scientific communication limits how hundreds of millions of speakers of African languages access and produce scientific knowledge. A core obstacle is the lack of established scientific terminology in these languages. We introduce AfriScience-MT, a parallel corpus covering six African languages (Amharic, Hausa, Luganda, Northern Sotho, Yorùbá, and isiZulu) across 11 scientific domains. Professional translators, working with expert science communicators, translated plain-language summaries of scientific papers into each target language and created new terms where none existed. We benchmark machine translation systems and large language models in zero-shot, few-shot, and fine-tuned settings. Our results show that closed-source models outperform all open-source models at both the sentence and document levels: GPT-5.4 and Gemini-3.1-Flash-Lite lead with average sentence-level COMET scores of 68.3 and 68.0, respectively, and tie at an average document-level COMET of 48.3. Among open systems, fine-tuned NLLB-1.3B reaches 67.3 at the sentence level, and TranslateGemma-12B reaches 44.0 at the document level with 1-shot in-context learning. We release AfriScience-MT to support benchmarking and document-level scientific MT for African languages.
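To make the zero-shot and few-shot settings and the reported averages concrete, the following is a minimal sketch, not the authors' evaluation code. The prompt template, the function names `build_prompt` and `macro_average`, and the choice of an unweighted mean over languages are all assumptions made for illustration; the abstract does not specify how prompts were formatted or how averages were computed.

```python
from statistics import mean


def build_prompt(source_text, target_language, examples=()):
    """Assemble a k-shot translation prompt (hypothetical template).

    `examples` is a sequence of (english, translation) pairs; an empty
    sequence gives the zero-shot case, one pair gives 1-shot, and so on.
    """
    lines = [f"Translate the following scientific text from English to {target_language}."]
    for src, tgt in examples:
        # Each in-context example is shown as a source/target pair.
        lines.append(f"English: {src}")
        lines.append(f"{target_language}: {tgt}")
    # The text to translate comes last, with the target side left open.
    lines.append(f"English: {source_text}")
    lines.append(f"{target_language}:")
    return "\n".join(lines)


def macro_average(scores_by_language):
    """Unweighted mean of per-language scores (assumed averaging scheme)."""
    return mean(list(scores_by_language.values()))


# Zero-shot: no in-context examples.
zero_shot = build_prompt("Cells divide to form new cells.", "Hausa")

# 1-shot: one placeholder example pair (not real corpus data).
one_shot = build_prompt(
    "Cells divide to form new cells.",
    "Hausa",
    examples=[("<example source sentence>", "<example reference translation>")],
)
```

Calling `macro_average` on a mapping from language name to a system's COMET score would then give one averaged figure per system, comparable in kind (though not necessarily in method) to the averages quoted in the abstract.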