LLMs Reproduce Stereotypes of Sexual and Gender Minorities
This paper addresses representational harms towards sexual and gender minorities in AI systems and highlights a critical gap in bias research, but its contribution is incremental: it extends existing bias frameworks to new groups.
The paper studies the biases of large language models (LLMs) towards sexual and gender minorities beyond binary categories. It finds that LLMs reproduce and amplify negative stereotypes both in survey responses and in text generation, including creative writing.
A large body of research has found substantial gender bias in NLP systems. Most of this research takes a binary, essentialist view of gender: limiting its variation to the categories _men_ and _women_, conflating gender with sex, and ignoring different sexual identities. But gender and sexuality exist on a spectrum, so in this paper we study the biases of large language models (LLMs) towards sexual and gender minorities beyond binary categories. Grounding our study in a widely used social psychology model, the Stereotype Content Model, we demonstrate that English-language survey questions about social perceptions elicit more negative stereotypes of sexual and gender minorities from both humans and LLMs. We then extend this framework to a more realistic use case: text generation. Our analysis shows that LLMs generate stereotyped representations of sexual and gender minorities in this setting, indicating that they amplify representational harms in creative writing, a widely advertised use for LLMs.
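As a rough illustration of how survey-style ratings might be compared across groups, the sketch below averages Likert scores per group and dimension. It assumes the two dimensions commonly used in the Stereotype Content Model, warmth and competence; the abstract does not state which items or scales the authors used. The group labels, the `mean_scores` helper, and all rating values are hypothetical and are not results from the paper.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ratings: (group, dimension, score on a 1-5 Likert scale).
# These values are made up for illustration; they are not data from the paper.
ratings = [
    ("group_a", "warmth", 4),
    ("group_a", "warmth", 5),
    ("group_a", "competence", 4),
    ("group_b", "warmth", 2),
    ("group_b", "warmth", 3),
    ("group_b", "competence", 3),
]


def mean_scores(rows):
    """Average the ratings for each (group, dimension) pair."""
    buckets = defaultdict(list)
    for group, dimension, score in rows:
        buckets[(group, dimension)].append(score)
    return {key: mean(values) for key, values in buckets.items()}


scores = mean_scores(ratings)
for (group, dimension), value in sorted(scores.items()):
    print(f"{group:8s} {dimension:11s} {value:.2f}")
```

In a setup like this, a lower mean for one group on a given dimension would correspond to a more negative stereotype on that dimension. The same aggregation could in principle be applied to human and LLM responses separately before comparing them.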