CV · AI · CY · HC · Jun 12, 2024

Refusal as Silence: Gendered Disparities in Vision-Language Model Responses

arXiv:2406.08222v3 · 2 citations
Originality: Incremental advance
AI Analysis

The paper addresses algorithmic fairness by highlighting identity-driven disparities in AI systems, an incremental but important step for equity audits and content moderation.

This study investigated how refusal behavior in a vision-language model (GPT-4V) varies by gender identity, finding that transgender and non-binary personas experience significantly higher refusal rates in binary gender classification tasks, even in non-harmful contexts.

Refusal behavior by Large Language Models is increasingly visible in content moderation, yet little is known about how refusals vary by the identity of the user making the request. This study investigates refusal as a sociotechnical outcome through a counterfactual persona design that varies gender identity--including male, female, non-binary, and transgender personas--while keeping the classification task and visual input constant. Focusing on a vision-language model (GPT-4V), we examine how identity-based language cues influence refusal in binary gender classification tasks. We find that transgender and non-binary personas experience significantly higher refusal rates, even in non-harmful contexts. Our findings also provide methodological implications for equity audits and content analysis using LLMs. Our findings underscore the importance of modeling identity-driven disparities and caution against uncritical use of AI systems for content coding. This study advances algorithmic fairness by reframing refusal as a communicative act that may unevenly regulate epistemic access and participation.
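The counterfactual persona design described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the prompt wording, the `build_prompts` and `refusal_rates` helpers, and the record format are all assumptions made for illustration. Only the four persona labels come from the abstract. The idea is that the task text (and, in the real study, the image) stays fixed while the identity cue varies, and refusals are then tallied per persona.

```python
from collections import defaultdict

# Persona labels taken from the abstract.
PERSONAS = ["male", "female", "non-binary", "transgender"]

# Hypothetical task wording; the paper's actual prompt is not given here.
TASK = "Classify the person in the image as male or female."


def build_prompts(task=TASK, personas=PERSONAS):
    """Return one prompt per persona; only the identity cue varies."""
    return {p: f"I am a {p} person. {task}" for p in personas}


def refusal_rates(records):
    """Compute the refusal rate per persona.

    records: iterable of (persona, refused) pairs, where refused is a bool
    produced by whatever refusal-coding step an audit uses (assumed here).
    """
    totals = defaultdict(int)
    refusals = defaultdict(int)
    for persona, refused in records:
        totals[persona] += 1
        refusals[persona] += int(refused)
    return {p: refusals[p] / totals[p] for p in totals}
```

In an audit along these lines, each prompt from `build_prompts()` would be sent to the model with the same image, each response would be coded as a refusal or not, and the resulting pairs would be passed to `refusal_rates` to compare personas. Any values fed in for testing are placeholders, not results from the paper.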

