CL, Jun 13, 2024

Standard Language Ideology in AI-Generated Language

arXiv:2406.08726v2, 16 citations
Originality: Synthesis-oriented
AI Analysis

This work highlights a societal problem for minoritized language communities by exposing how AI-generated language perpetuates linguistic biases. Its contribution is incremental: it offers recommendations rather than technical solutions.

The paper examines how large language models (LLMs) reflect and reinforce standard language ideology, particularly Standard American English, as the linguistic default, and proposes recommendations to address these issues for minoritized language communities.

Standard language ideology is reflected and reinforced in language generated by large language models (LLMs). We present a faceted taxonomy of open problems that illustrate how standard language ideology manifests in AI-generated language, alongside implications for minoritized language communities and society more broadly. We introduce the concept of standard AI-generated language ideology, a process through which LLMs position "standard" languages--particularly Standard American English (SAE)--as the linguistic default, reinforcing the perception that SAE is the most "appropriate" language. We then discuss ongoing tensions around what constitutes desirable system behavior, as well as advantages and drawbacks of generative AI tools attempting, or refusing, to imitate different English language varieties. Rather than prescribing narrow technical fixes, we offer three recommendations for researchers, practitioners, and funders that focus on shifting structural conditions and supporting more emancipatory outcomes for diverse language communities.

