Mind the Gesture: Evaluating AI Sensitivity to Culturally Offensive Non-Verbal Gestures
This work addresses the risk of AI systems perpetuating cultural offenses in global applications, making it an incremental step in AI safety.
The paper examines how AI systems handle culturally offensive non-verbal gestures. It finds that text-to-image systems show US-centric biases, detecting offensive gestures better in US contexts than in non-US ones; that large language models over-flag gestures as offensive; and that vision-language models default to US-based interpretations.
Gestures are an integral part of non-verbal communication. Their meanings vary across cultures, and misinterpretations can have serious social and diplomatic consequences. As AI systems become more integrated into global applications, ensuring they do not inadvertently perpetuate cultural offenses is critical. To this end, we introduce the Multi-Cultural Set of Inappropriate Gestures and Nonverbal Signs (MC-SIGNS), a dataset of 288 gesture-country pairs annotated for offensiveness, cultural significance, and contextual factors across 25 gestures and 85 countries. Through systematic evaluation using MC-SIGNS, we uncover critical limitations: text-to-image (T2I) systems exhibit strong US-centric biases, performing better at detecting offensive gestures in US contexts than in non-US ones; large language models (LLMs) tend to over-flag gestures as offensive; and vision-language models (VLMs) default to US-based interpretations when responding to universal concepts like wishing someone luck, frequently suggesting culturally inappropriate gestures. These findings highlight the urgent need for culturally aware AI safety mechanisms to ensure equitable global deployment of AI technologies.
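To make the evaluation setup concrete, the following is a minimal sketch of how a gesture-country pair and two of the reported failure modes (a US versus non-US accuracy gap, and over-flagging) might be measured. It is not the paper's code. The `GesturePair` schema, the function names, and the placeholder records are all assumptions made for illustration; the records are invented and are not drawn from MC-SIGNS.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class GesturePair:
    """One gesture-country pair with a gold label (hypothetical schema)."""
    gesture: str
    country: str
    offensive: bool


# A predictor maps (gesture, country) to "is this offensive here?"
Predictor = Callable[[str, str], bool]


def accuracy(pairs: List[GesturePair], predict: Predictor) -> float:
    """Fraction of pairs whose predicted label matches the gold label."""
    if not pairs:
        raise ValueError("no pairs to score")
    correct = sum(predict(p.gesture, p.country) == p.offensive for p in pairs)
    return correct / len(pairs)


def us_gap(pairs: List[GesturePair], predict: Predictor, us: str = "US") -> float:
    """Accuracy on US pairs minus accuracy on non-US pairs."""
    us_pairs = [p for p in pairs if p.country == us]
    other_pairs = [p for p in pairs if p.country != us]
    return accuracy(us_pairs, predict) - accuracy(other_pairs, predict)


def over_flag_rate(pairs: List[GesturePair], predict: Predictor) -> float:
    """Fraction of non-offensive pairs that the predictor flags as offensive."""
    benign = [p for p in pairs if not p.offensive]
    if not benign:
        raise ValueError("no non-offensive pairs to score")
    return sum(predict(p.gesture, p.country) for p in benign) / len(benign)


# Invented placeholder records, NOT real annotations from the dataset.
TOY_PAIRS = [
    GesturePair("gesture_a", "US", True),
    GesturePair("gesture_b", "US", False),
    GesturePair("gesture_b", "country_x", True),
    GesturePair("gesture_a", "country_y", False),
]


def us_centric_predict(gesture: str, country: str) -> bool:
    """Toy predictor that ignores the country and applies the US label everywhere."""
    return gesture == "gesture_a"


if __name__ == "__main__":
    print("US vs non-US accuracy gap:", us_gap(TOY_PAIRS, us_centric_predict))
    print("Over-flag rate:", over_flag_rate(TOY_PAIRS, us_centric_predict))
```

A positive value from `us_gap` indicates better performance in US contexts, and a high `over_flag_rate` indicates a tendency to mark benign gestures as offensive. The actual metrics used in the paper may differ.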