CL | SD | AS | May 5, 2025

Bemba Speech Translation: Exploring a Low-Resource African Language

Muhammad Hazim Al Farouq, Aman Kassahun Wassie, Yasmin Moslem

arXiv:2505.02518v3 | 6.71 citations | h-index: 8 | IWSLT

Originality: Synthesis-oriented

AI Analysis

This work addresses speech translation for Bemba, a low-resource African language, but appears incremental as it applies existing methods to a new language without novel breakthroughs.

The authors tackle Bemba-to-English speech translation, a low-resource task, by building cascaded systems with Whisper and NLLB-200 and applying data augmentation such as back-translation. The systems were submitted to IWSLT 2025, but no concrete performance numbers are reported.

This paper describes our system submission to the International Conference on Spoken Language Translation (IWSLT 2025), low-resource languages track, namely for Bemba-to-English speech translation. We built cascaded speech translation systems based on Whisper and NLLB-200, and employed data augmentation techniques, such as back-translation. We investigate the effect of using synthetic data and discuss our experimental setup.
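The cascaded setup and the back-translation step mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`cascade_translate`, `back_translate`, `build_hf_components`) are hypothetical, and the checkpoint names `openai/whisper-small` and `facebook/nllb-200-distilled-600M` are assumptions, since the abstract does not say which model sizes or fine-tuned weights were used.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical type aliases for this sketch (not taken from the paper).
Transcriber = Callable[[str], str]  # audio file path -> source-language transcript
Translator = Callable[[str], str]   # text in one language -> text in another


def cascade_translate(audio_path: str, asr: Transcriber, mt: Translator) -> str:
    """Cascaded speech translation: transcribe the audio, then translate the transcript."""
    transcript = asr(audio_path).strip()
    return mt(transcript).strip()


def back_translate(
    target_sentences: Iterable[str], reverse_mt: Translator
) -> List[Tuple[str, str]]:
    """Build synthetic (source, target) pairs from monolingual target-side text.

    Each real target sentence is translated back into the source language by
    a reverse-direction model; the synthetic source is paired with the real target.
    """
    pairs = []
    for tgt in target_sentences:
        synthetic_src = reverse_mt(tgt).strip()
        if synthetic_src:  # drop sentences the reverse model failed on
            pairs.append((synthetic_src, tgt))
    return pairs


def build_hf_components(
    asr_model: str = "openai/whisper-small",              # assumed checkpoint
    mt_model: str = "facebook/nllb-200-distilled-600M",   # assumed checkpoint
) -> Tuple[Transcriber, Translator]:
    """Wrap Hugging Face pipelines as plain callables (downloads models when called)."""
    from transformers import pipeline  # imported lazily so the sketch loads without it

    asr_pipe = pipeline("automatic-speech-recognition", model=asr_model)
    mt_pipe = pipeline(
        "translation",
        model=mt_model,
        src_lang="bem_Latn",  # NLLB-200 language code for Bemba
        tgt_lang="eng_Latn",  # NLLB-200 language code for English
    )

    def asr(path: str) -> str:
        return asr_pipe(path)["text"]

    def mt(text: str) -> str:
        return mt_pipe(text)[0]["translation_text"]

    return asr, mt
```

With the real models, usage would look like `asr, mt = build_hf_components()` followed by `cascade_translate("clip.wav", asr, mt)`, where `clip.wav` is a placeholder path. Keeping the two stages as plain callables makes it easy to swap in fine-tuned checkpoints for either stage, which a low-resource language such as Bemba would likely require.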
