SOC-PH | AI | AO | Sep 8, 2025

Disentangling Interaction and Bias Effects in Opinion Dynamics of Large Language Models

arXiv:2509.06858v1 | 1 citation | h-index: 36
Originality: Incremental advance
AI Analysis

This work addresses a challenge relevant to researchers and practitioners: accurately modeling human opinion dynamics with LLMs. It is an incremental advance, building on existing bias analysis and contributing a new framework.

The paper tackles the problem of disentangling interaction effects from systematic biases in large language models (LLMs) used to simulate opinion dynamics. It finds that opinion trajectories quickly converge to a shared attractor, that the influence of interaction fades over time, and that the impact of the biases varies across LLMs. Fine-tuning an LLM on strongly opinionated statements shifts the attractor correspondingly.

Large Language Models are increasingly used to simulate human opinion dynamics, yet the effect of genuine interaction is often obscured by systematic biases. We present a Bayesian framework to disentangle and quantify three such biases: (i) a topic bias toward prior opinions in the training data; (ii) an agreement bias favoring agreement irrespective of the question; and (iii) an anchoring bias toward the initiating agent's stance. Applying this framework to multi-step dialogues reveals that opinion trajectories tend to quickly converge to a shared attractor, with the influence of the interaction fading over time, and the impact of biases differing between LLMs. In addition, we fine-tune an LLM on different sets of strongly opinionated statements (incl. misinformation) and demonstrate that the opinion attractor shifts correspondingly. Exposing stark differences between LLMs and providing quantitative tools to compare them to human subjects in the future, our approach highlights both chances and pitfalls in using LLMs as proxies for human behavior.
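
The qualitative picture in the abstract (trajectories converging to a shared attractor while the influence of interaction fades) can be illustrated with a toy simulation. This is only a sketch under stated assumptions and is not the paper's Bayesian framework: the function `simulate`, the linear update rule, and the parameters `bias`, `coupling`, and `decay` are all hypothetical choices made for illustration.

```python
def simulate(x_a, x_b, attractor, bias=0.3, coupling=0.5, decay=0.5, steps=20):
    """Toy two-agent opinion dynamics on the interval [-1, 1].

    Hypothetical update rule (an assumption, not the paper's model):
    each agent moves toward its partner with a weight that shrinks
    geometrically over time, and toward a fixed attractor with a
    constant weight `bias`.
    """
    trajectory = [(x_a, x_b)]
    for t in range(steps):
        w = coupling * decay ** t  # interaction weight fades with each step
        new_a = x_a + w * (x_b - x_a) + bias * (attractor - x_a)
        new_b = x_b + w * (x_a - x_b) + bias * (attractor - x_b)
        x_a, x_b = new_a, new_b
        trajectory.append((x_a, x_b))
    return trajectory


if __name__ == "__main__":
    # Two different starting pairs end up near the same attractor value.
    for start in [(-1.0, 1.0), (0.9, -0.2)]:
        final_a, final_b = simulate(*start, attractor=0.4)[-1]
        print(start, round(final_a, 3), round(final_b, 3))
```

In this toy model, changing `attractor` moves the common end point of all trajectories, which loosely mirrors the reported effect of fine-tuning on opinionated statements. Separating the contributions of topic, agreement, and anchoring bias, as the paper does, would require fitting a statistical model to observed dialogues rather than fixing the weights by hand.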

