HC AI CL May 27, 2025

Can we Debias Social Stereotypes in AI-Generated Images? Examining Text-to-Image Outputs and User Perceptions

Saharsh Barve, Andy Mao, Jiayue Melissa Shi, Prerna Juneja, Koustuv Saha

arXiv:2505.20692v19.57 citations h-index: 3

Originality: Incremental advance

AI Analysis

The paper addresses ethical concerns about AI amplifying societal stereotypes in generated images, though its contribution is incremental: a specific detection and mitigation method.

This paper tackles the problem of social stereotypes in text-to-image (T2I) generation by proposing a bias detection rubric and a Social Stereotype Index (SSI), which the authors use to evaluate three major models; initial outputs proved prone to stereotypes. Targeted prompt refinement using LLMs significantly reduced bias, with SSI dropping by 51-69% across categories (61% geocultural, 69% occupational, 51% adjectival). The authors also note a tension: users often perceived stereotypical images as more aligned with their expectations.

Recent advances in generative AI have enabled visual content creation through text-to-image (T2I) generation. However, despite their creative potential, T2I models often replicate and amplify societal stereotypes -- particularly those related to gender, race, and culture -- raising important ethical concerns. This paper proposes a theory-driven bias detection rubric and a Social Stereotype Index (SSI) to systematically evaluate social biases in T2I outputs. We audited three major T2I model outputs -- DALL-E-3, Midjourney-6.1, and Stability AI Core -- using 100 queries across three categories -- geocultural, occupational, and adjectival. Our analysis reveals that initial outputs are prone to include stereotypical visual cues, including gendered professions, cultural markers, and western beauty norms. To address this, we adopted our rubric to conduct targeted prompt refinement using LLMs, which significantly reduced bias -- SSI dropped by 61% for geocultural, 69% for occupational, and 51% for adjectival queries. We complemented our quantitative analysis through a user study examining perceptions, awareness, and preferences around AI-generated biased imagery. Our findings reveal a key tension -- although prompt refinement can mitigate stereotypes, it can limit contextual alignment. Interestingly, users often perceived stereotypical images to be more aligned with their expectations. We discuss the need to balance ethical debiasing with contextual relevance and call for T2I systems that support global diversity and inclusivity while not compromising the reflection of real-world social complexity.
