CL · Jan 9

Afri-MCQA: Multimodal Cultural Question Answering for African Languages

arXiv:2601.05699v2 · 2 citations · h-index: 39 · Has Code
Originality: Synthesis-oriented
AI Analysis

This paper addresses AI inclusivity for speakers of African languages by providing a new benchmark. The contribution is incremental: it consists of dataset creation and evaluation rather than novel methods.

The authors address the underrepresentation of African languages in AI by creating Afri-MCQA, a multimodal cultural question-answering benchmark covering 15 languages. They find that open-weight large language models perform poorly, with near-zero accuracy on open-ended visual question answering (VQA) when queried in the native language or in speech.

Africa is home to over one-third of the world's languages, yet remains underrepresented in AI research. We introduce Afri-MCQA, the first Multilingual Cultural Question-Answering benchmark covering 7.5k Q&A pairs across 15 African languages from 12 countries. The benchmark offers parallel English-African language Q&A pairs across text and speech modalities and was entirely created by native speakers. Benchmarking large language models (LLMs) on Afri-MCQA shows that open-weight models perform poorly across evaluated cultures, with near-zero accuracy on open-ended VQA when queried in native language or speech. To evaluate linguistic competence, we include control experiments that assess it separately from cultural knowledge, and we observe significant performance gaps between native languages and English for both text and speech. These findings underscore the need for speech-first approaches, culturally grounded pretraining, and cross-lingual cultural transfer. To support more inclusive multimodal AI development in African languages, we release Afri-MCQA under an academic license or CC BY-NC 4.0 on HuggingFace (https://huggingface.co/datasets/Atnafu/Afri-MCQA).
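The abstract reports performance gaps between English and native-language queries. As a rough illustration of how such a gap could be computed for multiple-choice items, here is a minimal Python sketch. It is not the authors' evaluation code: the record fields (`language`, `query_mode`, `predicted`, `gold`), the placeholder language names, and the toy records are all hypothetical and do not reflect the actual Afri-MCQA schema or results.

```python
from collections import defaultdict


def accuracy_by_language(records):
    """Return {(language, query_mode): accuracy} for multiple-choice records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for r in records:
        key = (r["language"], r["query_mode"])
        total[key] += 1
        correct[key] += int(r["predicted"] == r["gold"])
    return {key: correct[key] / total[key] for key in total}


def english_native_gap(acc):
    """Return {language: english_acc - native_acc} where both modes exist."""
    languages = {lang for lang, _ in acc}
    return {
        lang: acc[(lang, "english")] - acc[(lang, "native")]
        for lang in languages
        if (lang, "english") in acc and (lang, "native") in acc
    }


# Toy, made-up records for illustration only (NOT drawn from Afri-MCQA).
# "english" / "native" mark the language in which the question was asked.
records = [
    {"language": "lang_a", "query_mode": "english", "predicted": "B", "gold": "B"},
    {"language": "lang_a", "query_mode": "english", "predicted": "C", "gold": "C"},
    {"language": "lang_a", "query_mode": "native", "predicted": "B", "gold": "B"},
    {"language": "lang_a", "query_mode": "native", "predicted": "A", "gold": "C"},
    {"language": "lang_b", "query_mode": "english", "predicted": "D", "gold": "D"},
    {"language": "lang_b", "query_mode": "english", "predicted": "A", "gold": "B"},
    {"language": "lang_b", "query_mode": "native", "predicted": "A", "gold": "D"},
    {"language": "lang_b", "query_mode": "native", "predicted": "C", "gold": "B"},
]

acc = accuracy_by_language(records)
gap = english_native_gap(acc)
for lang in sorted(gap):
    print(f"{lang}: English minus native accuracy = {gap[lang]:.2f}")
```

Exact-match scoring as used here only suits multiple-choice answers; open-ended VQA responses would need a different scoring scheme, which this sketch does not attempt.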

