CL, CY | Jun 13, 2024

Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination

arXiv:2406.08818v3 | 77 citations
Originality: Incremental advance
AI Analysis

This work highlights dialect discrimination in AI language models, which can perpetuate linguistic bias against speakers of non-"standard" English varieties.

The study investigated linguistic bias in ChatGPT across ten English dialects. It found that the models default to "standard" varieties and that responses to non-"standard" varieties show more stereotyping, demeaning content, and condescension, along with poorer comprehension of the input. GPT-4 shows increased stereotyping despite improvements in other areas.

We present a large-scale study of linguistic bias exhibited by ChatGPT covering ten dialects of English (Standard American English, Standard British English, and eight widely spoken non-"standard" varieties from around the world). We prompted GPT-3.5 Turbo and GPT-4 with text by native speakers of each variety and analyzed the responses via detailed linguistic feature annotation and native speaker evaluation. We find that the models default to "standard" varieties of English; based on evaluation by native speakers, we also find that model responses to non-"standard" varieties consistently exhibit a range of issues: stereotyping (19% worse than for "standard" varieties), demeaning content (25% worse), lack of comprehension (9% worse), and condescending responses (15% worse). We also find that if these models are asked to imitate the writing style of prompts in non-"standard" varieties, they produce text that exhibits lower comprehension of the input and is especially prone to stereotyping. GPT-4 improves on GPT-3.5 in terms of comprehension, warmth, and friendliness, but also exhibits a marked increase in stereotyping (+18%). The results indicate that GPT-3.5 Turbo and GPT-4 can perpetuate linguistic discrimination toward speakers of non-"standard" varieties.

