CL · AI · Oct 16, 2024

Conformity in Large Language Models

Cambridge
arXiv:2410.12428v2 · 21 citations · h-index: 10 · ACL
AI Analysis

This paper addresses the problem that conformity bias can compromise the effectiveness of LLMs in information-seeking and decision-making tasks. The contribution is incremental, as it adapts known psychological effects to AI.

The paper investigates conformity bias in large language models (LLMs) by adapting psychological experiments. All tested models exhibited varying levels of conformity to majority responses, including incorrect ones, across knowledge domains, and higher uncertainty increased conformity. The authors propose two interventions, Devil's Advocate and Question Distillation, to mitigate it.

The conformity effect describes the tendency of individuals to align their responses with the majority. Studying this bias in large language models (LLMs) is crucial, as LLMs are increasingly used in various information-seeking and decision-making tasks as conversation partners to improve productivity. Thus, conformity to incorrect responses can compromise their effectiveness. In this paper, we adapt psychological experiments to examine the extent of conformity in popular LLMs. Our findings reveal that all tested models exhibit varying levels of conformity toward the majority, regardless of their initial choice or correctness, across different knowledge domains. Notably, we are the first to show that LLMs are more likely to conform when they are more uncertain in their own prediction. We further explore factors that influence conformity, such as training paradigms and input characteristics, finding that instruction-tuned models are less susceptible to conformity, while increasing the naturalness of majority tones amplifies conformity. Finally, we propose two interventions, Devil's Advocate and Question Distillation, to mitigate conformity, providing insights into building more robust language models.

