Belief Engine: Configurable and Inspectable Stance Dynamics in Multi-Agent LLM Deliberation

Joshua C. Yang, Maurice Flechtner, Damian Dailisan, Michiel A. Bakker

arXiv:2605.15343


AI Analysis

For researchers studying multi-agent deliberation, the Belief Engine offers an interpretable framework for separating evidence uptake from other drivers of stance change, such as anchoring, which addresses the black-box nature of stance changes in LLM simulations.

The Belief Engine introduces an auditable belief-update layer for LLM-based agents that models stance changes as evidence-driven updates, enabling configurable and inspectable deliberation dynamics. On the DEBATE dataset, it best reconstructs participants whose final stance follows extracted evidence, while stable or evidence-opposed cases are attributed to anchoring or external factors.

LLM-based agents are increasingly used to simulate deliberative interactions such as negotiation, conflict resolution, and multi-turn opinion exchange. Yet generated transcripts often do not reveal why an agent's stance changes: movement may reflect evidence uptake, anchoring, role drift, echoing, or changed prompt and retrieval context. We introduce the Belief Engine (BE), an auditable belief-update layer that treats "belief" as an evidential state over a proposition and exposes it as scalar stance. BE extracts arguments into structured memory and updates stance with a log-odds rule controlled by evidence uptake u and prior anchoring a. Across multiple base LLMs, parameter sweeps show that these controls reliably shape stance dynamics while preserving an evidence-level update trail. On DEBATE, a human deliberation dataset with pre/post opinions, BE best reconstructs participants whose final stance follows extracted evidence; stable and evidence-opposed cases instead point to anchoring or factors outside the extracted evidence stream. BE provides configurable infrastructure for studying evidence-grounded deliberation, where openness, commitment, convergence, and disagreement can be tied to explicit update assumptions rather than hidden prompt effects.
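The abstract names a log-odds update controlled by evidence uptake u and prior anchoring a, but it does not give the rule's exact form. The following is only a minimal sketch under one assumed form: each extracted argument carries a signed evidence weight, uptake u scales that weight in log-odds space, and anchoring a pulls the result back toward the agent's initial log-odds. The class name `BeliefState`, the `evidence_weight` input, and the mixing formula are all hypothetical and are not taken from the paper.

```python
import math


def logit(p: float) -> float:
    """Convert a probability in (0, 1) to log-odds."""
    return math.log(p / (1.0 - p))


def sigmoid(x: float) -> float:
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


class BeliefState:
    """Hypothetical scalar belief over one proposition (not the paper's code)."""

    def __init__(self, prior: float, u: float, a: float):
        self.u = u  # evidence uptake: 0 ignores evidence, 1 takes it at face value
        self.a = a  # prior anchoring: 0 is unanchored, 1 never leaves the prior
        self.prior_log_odds = logit(prior)
        self.log_odds = self.prior_log_odds
        self.trail = []  # evidence-level record of every update

    @property
    def stance(self) -> float:
        """Scalar stance exposed as a probability."""
        return sigmoid(self.log_odds)

    def update(self, argument_id: str, evidence_weight: float) -> float:
        """Apply one argument; positive weights support the proposition."""
        before = self.log_odds
        # Assumed form: shift log-odds by the uptake-scaled evidence weight...
        proposed = before + self.u * evidence_weight
        # ...then mix with the prior log-odds according to the anchoring strength.
        self.log_odds = self.a * self.prior_log_odds + (1.0 - self.a) * proposed
        self.trail.append(
            {
                "argument_id": argument_id,
                "evidence_weight": evidence_weight,
                "log_odds_before": before,
                "log_odds_after": self.log_odds,
            }
        )
        return self.stance


if __name__ == "__main__":
    open_agent = BeliefState(prior=0.5, u=1.0, a=0.0)
    anchored_agent = BeliefState(prior=0.5, u=1.0, a=0.8)
    for agent in (open_agent, anchored_agent):
        agent.update("arg-1", 1.0)
        agent.update("arg-2", 0.5)
    print(round(open_agent.stance, 3), round(anchored_agent.stance, 3))
```

Under these assumptions the same two supporting arguments move the unanchored agent much further than the anchored one, and `trail` records which argument produced each shift, which is the kind of inspectable update history the abstract describes.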
