Yann Le Beux

2 Papers

99.5 · CY · May 29
Next-Billion AI Index: The compass for AI utility and adoption in the global majority

Ambrish Rawat, Jessica He, Subhabrata Majumdar et al.

Generative AI assessments remain dominated by frontier capability benchmarks that often fail to capture whether systems can be sustainably deployed, adapted, and trusted in locally grounded and infrastructure-constrained settings. This paper introduces the Next Billion AI Index (nexbax), which we believe is the first diagnostic framework to treat economic viability, operational deployability, and governance alignment as co-equal determinants of AI utility in next-billion-user contexts. Rather than treating usefulness as a single outcome, nexbax operationalizes the preconditions for useful AI through 10 dimensions organized under three themes: Effective Efficiency, Operational Practicality, and Societal Integrity. These dimensions assess whether systems are economically viable, deployable under infrastructure and workflow constraints, and aligned with local needs, user expectations, and collaborative development practices. We pair the framework with rubrics for weak, moderate, and strong performance, and conduct a formative expert evaluation through eleven semi-structured interviews with founders, developers, product leaders, and technical practitioners building AI systems for next-billion markets. Participants found the index useful for reasoning about adoption trade-offs and effective at capturing factors shaping real-world AI uptake -- particularly cost, usability, reliability, and trust. They also identified the need for contextual explanations, domain-specific evidence, and broader stakeholder validation. Nexbax is therefore proposed not as a universal score of social value, but as a diagnostic for artificial useful intelligence: a way to make visible the technical, economic, and governance properties that make inclusive AI deployment more viable.

CL · Nov 27, 2025 · Code
AfriStereo: A Culturally Grounded Dataset for Evaluating Stereotypical Bias in Large Language Models

Yann Le Beux, Oluchi Audu, Oche D. Ankeli et al.

Existing AI bias evaluation benchmarks largely reflect Western perspectives, leaving African contexts underrepresented and enabling harmful stereotypes in applications across various domains. To address this gap, we introduce AfriStereo, the first open-source African stereotype dataset and evaluation framework grounded in local socio-cultural contexts. Through community-engaged efforts across Senegal, Kenya, and Nigeria, we collected 1,163 stereotypes spanning gender, ethnicity, religion, age, and profession. Using few-shot prompting with human-in-the-loop validation, we augmented the dataset to over 5,000 stereotype-antistereotype pairs. Entries were validated through semantic clustering and manual annotation by culturally informed reviewers. Preliminary evaluation of language models reveals that nine of eleven models exhibit statistically significant bias, with Bias Preference Ratios (BPR) ranging from 0.63 to 0.78 (p <= 0.05), indicating systematic preferences for stereotypes over antistereotypes, particularly across age, profession, and gender dimensions. Domain-specific models appeared to show weaker bias in our setup, suggesting task-specific training may mitigate some associations. Looking ahead, AfriStereo opens pathways for future research on culturally grounded bias evaluation and mitigation, offering key methodologies for the AI community on building more equitable, context-aware, and globally inclusive NLP technologies.
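
The abstract does not define how the Bias Preference Ratio is computed, so the following Python sketch is only one plausible reading, not the authors' method. It assumes BPR is the fraction of stereotype-antistereotype pairs for which a model assigns the higher score (for example, a log-likelihood) to the stereotype, and that significance is checked with a one-sided exact binomial test against a no-preference rate of 0.5. The function names, the handling of ties, and the toy scores are all hypothetical.

```python
from math import comb


def bias_preference_ratio(stereo_scores, anti_scores):
    """Return (BPR, count) where count is the number of pairs in which the
    stereotype scores strictly higher than its antistereotype.

    Assumption: ties count as "no preference for the stereotype".
    """
    if len(stereo_scores) != len(anti_scores) or not stereo_scores:
        raise ValueError("need two non-empty score lists of equal length")
    preferred = sum(s > a for s, a in zip(stereo_scores, anti_scores))
    return preferred / len(stereo_scores), preferred


def binomial_p_value(k, n, p=0.5):
    """One-sided exact p-value P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Toy scores, invented for illustration only (not data from the paper).
stereo = [-1.0, -2.0, -1.5, -3.0, -0.5, -2.2, -1.1, -4.0, -2.5, -1.8]
anti = [-1.5, -2.5, -1.0, -3.5, -0.9, -2.0, -1.6, -4.4, -2.1, -2.3]

bpr, k = bias_preference_ratio(stereo, anti)
p_value = binomial_p_value(k, len(stereo))
print(f"BPR = {bpr:.2f}, one-sided p = {p_value:.4f}")
```

Under this reading, a BPR of 0.5 would mean no systematic preference. With only ten toy pairs, a BPR of 0.70 is not significant at the 0.05 level, which illustrates why a dataset of several thousand pairs matters for detecting ratios in the reported 0.63 to 0.78 range.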