Probing Social Identity Bias in Chinese LLMs with Gendered Pronouns and Social Groups
This work addresses bias in Chinese LLMs, which matters for fair deployment in user-facing applications. Its contribution is incremental, since it extends known English-language bias findings to a Chinese context.
The study investigated social identity bias in Chinese large language models (LLMs) by testing them with Mandarin prompts and real chatbot conversations, finding systematic ingroup-positive and outgroup-negative tendencies that intensify in real interactions.
Large language models (LLMs) are increasingly deployed in user-facing applications, raising concerns about their potential to reflect and amplify social biases. We investigate social identity framing in ten representative Chinese LLMs using Mandarin-specific prompts, evaluating responses to ingroup ("We") and outgroup ("They") framings and extending the setting to 240 social groups salient in the Chinese context. To complement these controlled experiments, we analyze Chinese-language conversations from a corpus of real interactions between users and chatbots. Across models, we observe systematic ingroup-positive and outgroup-negative tendencies. These tendencies are not confined to synthetic prompts: they also appear in naturalistic dialogue, which suggests that bias dynamics may strengthen in real interactions. Our study provides a language-aware evaluation framework for Chinese LLMs and indicates that social identity biases documented in English generalize cross-linguistically and may intensify in user-facing contexts.
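As a rough illustration of the ingroup/outgroup comparison described above, the following is a minimal sketch and not the study's actual pipeline. The Mandarin sentence stems, the function names `build_prompts` and `positive_odds_ratio`, the sentiment labels, and the toy data are all assumptions introduced here; the source does not state its exact prompts, its sentiment classifier, or its bias statistic. The sketch assumes each model completion has already been labeled with a sentiment and compares the odds of a positive completion under the two framings.

```python
from collections import Counter

# Hypothetical Mandarin stems (assumption; the source does not list its prompts).
INGROUP_STEM = "我们"              # "We"
OUTGROUP_STEMS = ("他们", "她们")   # "They" (masculine/generic and feminine forms)


def build_prompts(group=""):
    """Return (framing, prompt) pairs, optionally naming a social group."""
    prompts = [("ingroup", f"{INGROUP_STEM}{group}是")]
    prompts += [("outgroup", f"{stem}{group}是") for stem in OUTGROUP_STEMS]
    return prompts


def positive_odds_ratio(labeled, target="positive"):
    """Odds of `target` sentiment for ingroup relative to outgroup completions.

    `labeled` is an iterable of (framing, sentiment) pairs. Adding 0.5 to each
    cell (the Haldane-Anscombe correction) avoids division by zero.
    """
    counts = Counter((framing, sentiment == target) for framing, sentiment in labeled)
    a = counts[("ingroup", True)] + 0.5
    b = counts[("ingroup", False)] + 0.5
    c = counts[("outgroup", True)] + 0.5
    d = counts[("outgroup", False)] + 0.5
    return (a / b) / (c / d)


# Toy labels for illustration only; these are not results from the study.
toy_labels = (
    [("ingroup", "positive")] * 3
    + [("ingroup", "negative")]
    + [("outgroup", "positive")]
    + [("outgroup", "negative")] * 3
)
ratio = positive_odds_ratio(toy_labels)
```

Under these assumptions, a ratio above 1 would correspond to an ingroup-positive tendency. Calling the same function with `target="negative"` and obtaining a ratio below 1 would correspond to an outgroup-negative tendency.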