CY · AI · Feb 5

From Bias Mitigation to Bias Negotiation: Governing Identity and Sociocultural Reasoning in Generative AI

arXiv:2602.18459v1
h-index: 7
AI Analysis

This paper addresses the governance of identity and sociocultural reasoning in AI, which matters for both justice and model functionality, but the contribution is incremental because it builds on existing bias mitigation approaches.

The paper tackles the problem of regulating identity in generative AI beyond bias mitigation. It proposes 'bias negotiation' as a normative framework for managing identity-conditioned judgments and probes its feasibility through semi-structured interviews with publicly deployed chatbots, identifying recurring negotiation repertoires and failure modes.

LLMs act in the social world by drawing upon shared cultural patterns to make social situations understandable and actionable. Because identity is often part of the inferential substrate of competent judgment, ethical alignment requires regulating when and how systems invoke identity. Yet the dominant governance regime for identity-related harm remains bias mitigation, which treats identity primarily as a source of measurable disparities or harmful associations to be detected and suppressed. This leaves underspecified a positive, context-sensitive role for identity in interpretation. We call this governance problem bias negotiation: the normative regulation of identity-conditioned judgments of sociocultural relevance, inference, and justification. Empirically, we probe the feasibility of bias negotiation through semi-structured interviews with multiple publicly deployed chatbots. We identify recurring repertoires for negotiating identity, including probabilistic framing of group tendencies and harm-value balancing. We also observe failure modes in which models avoid hard tradeoffs or apply principles inconsistently. Bias negotiation matters for justice because a positive role for sociocultural reasoning is required to recognize and potentially remediate structural inequities. But it is equally implicated in core model functionality, as sociocultural competence is needed for systems that operate across heterogeneous cultural contexts. Because bias negotiation is a procedural capability expressed through deliberation and interaction, it cannot be validated by static benchmarks alone. To support targeted training, we introduce a broad but explicit framework that decomposes bias negotiation into an action space of negotiation moves (what to observe and score) and a complementary set of case features (over which the model negotiates), enabling systematic test-suite design and evaluation.
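
One way to picture the decomposition described at the end of the abstract is as a small data model: an enumeration of negotiation moves to observe and score, plus a grid of case features from which test cases are generated. The sketch below is illustrative only and is not the paper's framework. The move names are taken from the repertoires and failure modes mentioned in the abstract, while the feature names (`domain`, `identity_explicit`), the `build_suite` helper, and the `score` metric are hypothetical assumptions introduced for this example.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Move(Enum):
    # Two repertoires and two failure modes named in the abstract.
    PROBABILISTIC_FRAMING = "probabilistic framing of group tendencies"
    HARM_VALUE_BALANCING = "harm-value balancing"
    AVOIDS_TRADEOFF = "avoids hard tradeoffs"
    INCONSISTENT_PRINCIPLES = "applies principles inconsistently"


# Moves treated as failures by the hypothetical score below.
FAILURE_MODES = frozenset({Move.AVOIDS_TRADEOFF, Move.INCONSISTENT_PRINCIPLES})


@dataclass(frozen=True)
class Case:
    # Sorted (feature name, value) pairs describing one test scenario.
    features: tuple


def build_suite(feature_grid):
    """Enumerate one Case per combination of feature values (hypothetical helper)."""
    names = sorted(feature_grid)
    value_lists = [feature_grid[name] for name in names]
    return [Case(tuple(zip(names, combo))) for combo in product(*value_lists)]


def score(observed_moves):
    """Fraction of observed moves that are not failure modes (assumed metric)."""
    if not observed_moves:
        return 0.0
    ok = sum(1 for move in observed_moves if move not in FAILURE_MODES)
    return ok / len(observed_moves)


# Hypothetical feature grid; the paper's actual case features are not listed here.
grid = {
    "domain": ["hiring", "healthcare"],
    "identity_explicit": [True, False],
}
suite = build_suite(grid)
```

Here `suite` holds one case per combination of feature values, and `score` would be applied to the moves an annotator codes in a model's response to each case. Because the abstract argues that static benchmarks alone are insufficient, a per-response score like this would at most complement interactive evaluation.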

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
