CL | AI | CY | HC | LG | May 8, 2024

"They are uncultured": Unveiling Covert Harms and Social Threats in LLM Generated Conversations

UW
arXiv:2405.05378v1 | 31 citations | h-index: 6 | EMNLP
Originality: Incremental advance
AI Analysis

This work addresses cultural biases in LLMs that prior research has largely overlooked, a concern for both users and developers. The advance is incremental: it extends existing bias research to covert harms.

The study examines how LLMs perpetuate systemic biases by introducing the CHAST metrics to evaluate covert harms in generated conversations. Seven of the eight LLMs tested exhibited such harms, and they expressed more extreme views on non-Western concepts such as caste than on Western ones such as race.

Large language models (LLMs) have emerged as an integral part of modern societies, powering user-facing applications such as personal assistants and enterprise applications like recruitment tools. Despite their utility, research indicates that LLMs perpetuate systemic biases. Yet, prior works on LLM harms predominantly focus on Western concepts like race and gender, often overlooking cultural concepts from other parts of the world. Additionally, these studies typically investigate "harm" as a singular dimension, ignoring the various and subtle forms in which harms manifest. To address this gap, we introduce the Covert Harms and Social Threats (CHAST), a set of seven metrics grounded in social science literature. We utilize evaluation models aligned with human assessments to examine the presence of covert harms in LLM-generated conversations, particularly in the context of recruitment. Our experiments reveal that seven out of the eight LLMs included in this study generated conversations riddled with CHAST, characterized by malign views expressed in seemingly neutral language unlikely to be detected by existing methods. Notably, these LLMs manifested more extreme views and opinions when dealing with non-Western concepts like caste, compared to Western ones such as race.

Code Implementations: 1 repo
