CL | Jul 1, 2025

Mathematics Isn't Culture-Free: Probing Cultural Gaps via Entity and Scenario Perturbations

arXiv:2507.00883v2 | 3 citations | h-index: 6
Originality: Synthesis-oriented
AI Analysis

This work addresses cultural fairness in AI for global users, though it is incremental as it adapts existing benchmarks and methods.

The study addresses cultural bias in how mathematical problems are presented. It creates culturally adapted versions of the GSM8K test set for five regions and evaluates six large language models. The results show a consistent performance gap: models perform best on the original US-centric dataset and worse on the adapted versions, with reasoning-capable models showing more resilience.

Although mathematics is often considered culturally neutral, the way mathematical problems are presented can carry implicit cultural context. Existing benchmarks like GSM8K are predominantly rooted in Western norms, including names, currencies, and everyday scenarios. In this work, we create culturally adapted variants of the GSM8K test set for five regions (Africa, India, China, Korea, and Japan) using prompt-based transformations followed by manual verification. We evaluate six large language models (LLMs), ranging from 8B to 72B parameters, across five prompting strategies to assess their robustness to cultural variation in math problem presentation. Our findings reveal a consistent performance gap: models perform best on the original US-centric dataset and comparatively worse on culturally adapted versions. However, models with reasoning capabilities are more resilient to these shifts, suggesting that deeper reasoning helps bridge cultural presentation gaps in mathematical tasks.
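
To make the idea of an entity perturbation concrete, here is a minimal sketch. It is not the paper's pipeline: the abstract describes prompt-based transformations followed by manual verification, whereas this example uses a simple rule-based swap. The entity map, the sample problem, and all function names are hypothetical and chosen only for illustration. The sketch shows the two properties such an adaptation should have (surface entities change, numbers do not) and how a performance gap between original and adapted sets could be measured.

```python
import re

# Hypothetical entity map for one target region; not taken from the paper.
ENTITY_MAP = {
    "Emily": "Aiko",
    "dollars": "yen",
    "bagels": "onigiri",
}


def perturb_entities(problem: str, entity_map: dict[str, str]) -> str:
    """Swap whole-word surface entities while leaving every number untouched."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, entity_map)) + r")\b")
    return pattern.sub(lambda m: entity_map[m.group(1)], problem)


def numbers_in(text: str) -> list[str]:
    """Extract numeric literals so we can check the arithmetic is preserved."""
    return re.findall(r"\d+(?:\.\d+)?", text)


def accuracy(predictions: list, gold: list) -> float:
    """Fraction of predictions that exactly match the gold answers."""
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)


def accuracy_gap(orig_preds: list, adapted_preds: list, gold: list) -> float:
    """Accuracy on the original set minus accuracy on the adapted set."""
    return accuracy(orig_preds, gold) - accuracy(adapted_preds, gold)


original = "Emily buys 3 bagels for 2 dollars each. How many dollars does she spend?"
adapted = perturb_entities(original, ENTITY_MAP)

# The adaptation must not change the underlying arithmetic.
assert numbers_in(original) == numbers_in(adapted)
```

A plain swap like this does not check whether the adapted scenario remains plausible (for example, whether a price makes sense in the new currency), which is the kind of issue a manual verification step would be expected to catch.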
