CL · Sep 30, 2023
AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR
Tobi Olatunji, Tejumade Afonja, Aditya Yadavalli et al. · mila
Africa has a very low doctor-to-patient ratio. At very busy clinics, doctors may see 30+ patients per day, a heavy patient burden compared with developed countries, but productivity tools such as clinical automatic speech recognition (ASR) are lacking for these overworked clinicians. In developed nations, by contrast, clinical ASR is mature, even ubiquitous, and clinician-reported performance of commercial clinical ASR systems is generally satisfactory. Furthermore, the recent performance of general domain ASR is approaching human accuracy. Several gaps remain, however. Several publications have highlighted racial bias in speech-to-text algorithms, and performance on minority accents lags significantly. To our knowledge, there is no publicly available research or benchmark on accented African clinical ASR, and speech data is non-existent for the majority of African accents. We release AfriSpeech: 200 hours of Pan-African English speech (67,577 clips from 2,463 unique speakers across 120 indigenous accents from 13 countries) for clinical and general domain ASR, together with a benchmark test set and publicly available pre-trained models with SOTA performance on the AfriSpeech benchmark.
CV · Oct 22, 2025
Exposing Blindspots: Cultural Bias Evaluation in Generative Image Models
Huichan Seo, Sieun Choi, Minki Hong et al.
Generative image models produce striking visuals yet often misrepresent culture. Prior work has examined cultural bias mainly in text-to-image (T2I) systems, leaving image-to-image (I2I) editors underexplored. We bridge this gap with a unified evaluation across six countries, an 8-category/36-subcategory schema, and era-aware prompts, auditing both T2I generation and I2I editing under a standardized protocol that yields comparable diagnostics. Using open models with fixed settings, we derive cross-country, cross-era, and cross-category evaluations. Our framework combines standard automatic metrics, a culture-aware retrieval-augmented VQA, and expert human judgments collected from native reviewers. To enable reproducibility, we release the complete image corpus, prompts, and configurations. Our study reveals three findings: (1) under country-agnostic prompts, models default to Global-North, modern-leaning depictions that flatten cross-country distinctions; (2) iterative I2I editing erodes cultural fidelity even when conventional metrics remain flat or improve; and (3) I2I models apply superficial cues (palette shifts, generic props) rather than era-consistent, context-aware changes, often retaining source identity for Global-South targets. These results highlight that culture-sensitive edits remain unreliable in current systems. By releasing standardized data, prompts, and human evaluation protocols, we provide a reproducible, culture-centered benchmark for diagnosing and tracking cultural bias in generative image models.
CV · Jun 10, 2024
CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark
David Romero, Chenyang Lyu, Haryo Akbarianto Wibowo et al.
Visual Question Answering (VQA) is an important task in multimodal AI, and it is often used to test the ability of vision-language models to understand and reason over knowledge present in both visual and textual data. However, most current VQA models use datasets that focus primarily on English and a few major world languages, with images that are typically Western-centric. While recent efforts have tried to increase the number of languages covered by VQA datasets, they still lack diversity in low-resource languages. More importantly, although these datasets often extend their linguistic range via translation or other approaches, they usually keep the images the same, resulting in narrow cultural representation. To address these limitations, we construct CVQA, a new Culturally-diverse multilingual Visual Question Answering benchmark, designed to cover a rich set of languages and cultures, where we engage native speakers and cultural experts in the data collection process. As a result, CVQA includes culturally-driven images and questions from across 30 countries on four continents, covering 31 languages with 13 scripts, providing a total of 10k questions. We then benchmark several Multimodal Large Language Models (MLLMs) on CVQA, and show that the dataset is challenging for the current state-of-the-art models. This benchmark can serve as a probing evaluation suite for assessing the cultural capability and bias of multimodal models and hopefully encourage more research efforts toward increasing cultural awareness and linguistic diversity in this field.
CY · May 10, 2023
What We Know So Far: Artificial Intelligence in African Healthcare
Naome Etori, Ebasa Temesgen, Maria Gini
Healthcare in Africa is a complex issue influenced by many factors, including poverty, lack of infrastructure, and inadequate funding. However, artificial intelligence (AI) applied to healthcare has the potential to transform healthcare in Africa by improving the accuracy and efficiency of diagnosis, enabling earlier detection of diseases, and supporting the delivery of personalized medicine. This paper reviews how AI algorithms can be used to improve diagnostics, treatment, and disease monitoring, as well as how AI can improve access to healthcare in Africa as a low-resource setting, and discusses some of the critical challenges and opportunities for its adoption. As such, there is a need for a well-coordinated effort by governments, the private sector, healthcare providers, and international organizations to create sustainable AI solutions that meet the unique needs of the African healthcare system.