Do language models practice what they preach? Examining language ideologies about gendered language reform encoded in LLMs
This research addresses implicit political bias and inconsistency in LLMs, a concern for users and developers working on value alignment in AI systems, though its scope is incremental.
The study examines language ideologies in LLMs through a case study on English gendered language reform. It finds political bias, in that LLMs align with conservative values in seemingly non-political contexts, and internal inconsistency, in that their use of gender-neutral variants depends on the metalinguistic context provided.
We study language ideologies in text produced by LLMs through a case study on English gendered language reform (concerning role nouns such as congressperson/-woman/-man, and singular they). First, we find political bias: when asked to use language that is "correct" or "natural", LLMs produce language most similar to what they produce when asked to align with conservative (vs. progressive) values. This shows how LLMs' metalinguistic preferences can implicitly communicate the language ideologies of a particular political group, even in seemingly non-political contexts. Second, we find that LLMs exhibit internal inconsistency: they use gender-neutral variants more often when more explicit metalinguistic context is provided. This shows how the language ideologies expressed in text produced by LLMs can vary, which may be unexpected to users. We discuss the broader implications of these findings for value alignment.