Cross-Cultural Value Awareness in Large Vision-Language Models

Phillip Howard, Xin Su, Kathleen C. Fraser

arXiv:2604.09945 · 88.1 · h-index: 5

Predicted impact: top 18% in CV · last 90 days
Originality: Incremental advance

AI Analysis

For researchers and developers of LVLMs, this paper highlights a previously underexplored fairness concern regarding cultural stereotypes, but the findings are incremental as they extend known bias issues to a new domain.

This work investigates how cultural contexts in images affect the value judgments of large vision-language models (LVLMs), finding that models exhibit cultural biases and lack awareness of cultural value differences. The study uses counterfactual image sets and Moral Foundations Theory to evaluate five popular LVLMs.

The rapid adoption of large vision-language models (LVLMs) in recent years has been accompanied by growing fairness concerns due to their propensity to reinforce harmful societal stereotypes. While significant attention has been paid to such fairness concerns in the context of social biases, relatively little prior work has examined the presence of stereotypes in LVLMs related to cultural contexts such as religion, nationality, and socioeconomic status. In this work, we aim to narrow this gap by investigating how cultural contexts depicted in images influence the judgments LVLMs make about a person's moral, ethical, and political values. We conduct a multi-dimensional analysis of such value judgments in five popular LVLMs using counterfactual image sets, which depict the same person across different cultural contexts. Our evaluation framework diagnoses LVLM awareness of cultural value differences through the use of Moral Foundations Theory, lexical analyses, and the sensitivity of generated values to depicted cultural contexts.
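One way to picture "sensitivity of generated values to depicted cultural contexts" is as the average dissimilarity between the value words a model produces for the same person across the images in one counterfactual set. The sketch below is illustrative only: the choice of Jaccard distance, the function names, and the example words are assumptions made here and are not taken from the paper.

```python
from itertools import combinations


def jaccard_distance(a: set[str], b: set[str]) -> float:
    """Return 1 - |a ∩ b| / |a ∪ b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def context_sensitivity(values_by_context: dict[str, set[str]]) -> float:
    """Mean pairwise Jaccard distance across contexts (hypothetical metric).

    0.0 means the generated values never change with the depicted context;
    1.0 means no two contexts share any value word.
    """
    pairs = list(combinations(values_by_context.values(), 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


# Made-up value words for one counterfactual set (same person, three contexts).
example = {
    "context_a": {"tradition", "loyalty", "family"},
    "context_b": {"tradition", "freedom", "family"},
    "context_c": {"tradition", "loyalty", "family"},
}

score = context_sensitivity(example)
print(f"sensitivity = {score:.3f}")
```

A score near zero would suggest the model assigns the same values regardless of context, while a high score would suggest the depicted context strongly drives the generated values. Whether either outcome reflects genuine cultural awareness or stereotyping cannot be read off this number alone.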
