CLAug 13, 2025

From Charts to Fair Narratives: Uncovering and Mitigating Geo-Economic Biases in Chart-to-Text

Ridwan Mahbub, Mohammed Saidul Islam, Mir Tafseer Nayeem, Md Tahmid Rahman Laskar, Mizanur Rahman, Shafiq Joty, Enamul Hoque

arXiv:2508.09450v12 citationsh-index: 61Has CodeEMNLP

Originality Incremental advance

AI Analysis

This addresses fairness issues in AI-generated content for users relying on automated chart summaries, though it is incremental as it focuses on identifying and mitigating a specific bias type.

The paper investigates geo-economic biases in chart-to-text generation by large Vision-Language Models, finding that models like GPT-4o-mini and Gemini-1.5-Flash produce more positive summaries for high-income countries across 6,000 chart-country pairs, and shows that prompt-based debiasing techniques are only partially effective.

Charts are very common for exploring data and communicating insights, but extracting key takeaways from charts and articulating them in natural language can be challenging. The chart-to-text task aims to automate this process by generating textual summaries of charts. While with the rapid advancement of large Vision-Language Models (VLMs), we have witnessed great progress in this domain, little to no attention has been given to potential biases in their outputs. This paper investigates how VLMs can amplify geo-economic biases when generating chart summaries, potentially causing societal harm. Specifically, we conduct a large-scale evaluation of geo-economic biases in VLM-generated chart summaries across 6,000 chart-country pairs from six widely used proprietary and open-source models to understand how a country's economic status influences the sentiment of generated summaries. Our analysis reveals that existing VLMs tend to produce more positive descriptions for high-income countries compared to middle- or low-income countries, even when country attribution is the only variable changed. We also find that models such as GPT-4o-mini, Gemini-1.5-Flash, and Phi-3.5 exhibit varying degrees of bias. We further explore inference-time prompt-based debiasing techniques using positive distractors but find them only partially effective, underscoring the complexity of the issue and the need for more robust debiasing strategies. Our code and dataset are publicly available here.

View on arXiv PDF

Similar