AI · MA · Mar 10

Chaotic Dynamics in Multi-LLM Deliberation

arXiv:2603.09127v155.0 · h-index: 12
Predicted impact top 67% in AI · last 90 days · Originality: Incremental advance
AI Analysis

The paper addresses stability in multi-LLM governance systems, an incremental but important step toward reliable collective AI decision-making.

The paper tackles instability in multi-LLM deliberation by modeling five-agent committees as random dynamical systems and quantifying inter-run sensitivity with an empirical Lyapunov exponent. It finds that role differentiation and model heterogeneity each raise divergence even at temperature 0, where practitioners often expect deterministic behavior. In the HL-01 benchmark, for example, mixed no-role committees reach an exponent of 0.0947.

Collective AI systems increasingly rely on multi-LLM deliberation, but their stability under repeated execution remains poorly characterized. We model five-agent LLM committees as random dynamical systems and quantify inter-run sensitivity using an empirical Lyapunov exponent ($\hat{\lambda}$) derived from trajectory divergence in committee mean preferences. Across 12 policy scenarios, a factorial design at $T=0$ identifies two independent routes to instability: role differentiation in homogeneous committees and model heterogeneity in no-role committees. Critically, these effects appear even in the $T=0$ regime where practitioners often expect deterministic behavior. In the HL-01 benchmark, both routes produce elevated divergence ($\hat{\lambda}=0.0541$ and $0.0947$, respectively), while homogeneous no-role committees also remain in a positive-divergence regime ($\hat{\lambda}=0.0221$). The combined mixed+roles condition is less unstable than mixed+no-role ($\hat{\lambda}=0.0519$ vs $0.0947$), showing non-additive interaction. Mechanistically, Chair-role ablation reduces $\hat{\lambda}$ most strongly, and targeted protocol variants that shorten memory windows further attenuate divergence. These results support stability auditing as a core design requirement for multi-LLM governance systems.
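
The abstract does not spell out how the empirical Lyapunov exponent is estimated, so the following is only a minimal sketch under stated assumptions, not the authors' procedure. It assumes the exponent is taken as the least-squares slope of the log distance between the committee-mean preference trajectories of two repeated runs, which is one common way to estimate divergence rates. The function names, the `(rounds, agents)` data layout, and the synthetic trajectories are all hypothetical. The second function simply reproduces the 2x2 arithmetic behind the "non-additive interaction" statement, using the four HL-01 values quoted in the abstract.

```python
import numpy as np


def empirical_lyapunov(run_a, run_b, eps=1e-12):
    """Estimate a divergence rate between two repeated runs.

    run_a, run_b: arrays of shape (rounds, agents) with each agent's
    preference score per deliberation round (assumed layout).
    Returns the slope of log |mean_a(t) - mean_b(t)| against round t.
    """
    mean_a = np.asarray(run_a, dtype=float).mean(axis=1)
    mean_b = np.asarray(run_b, dtype=float).mean(axis=1)
    # eps guards against log(0) when the two runs coincide in a round.
    dist = np.abs(mean_a - mean_b) + eps
    rounds = np.arange(len(dist))
    # polyfit returns coefficients from highest degree down: [slope, intercept].
    slope, _intercept = np.polyfit(rounds, np.log(dist), 1)
    return float(slope)


def interaction_contrast(base, factor_a_only, factor_b_only, both):
    """2x2 interaction contrast; zero would mean purely additive effects."""
    return both - factor_a_only - factor_b_only + base


# Synthetic check: two five-agent runs whose mean gap grows as 1e-3 * exp(0.05 t).
t = np.arange(20)
run_a = np.zeros((20, 5))
run_b = np.tile((1e-3 * np.exp(0.05 * t))[:, None], (1, 5))
estimate = empirical_lyapunov(run_a, run_b)  # close to 0.05 by construction

# HL-01 values from the abstract: homogeneous no-role, homogeneous with roles,
# mixed no-role, mixed with roles.
contrast = interaction_contrast(0.0221, 0.0541, 0.0947, 0.0519)
print(round(estimate, 4), round(contrast, 4))
```

With the quoted values, the contrast is about -0.0748: a purely additive model would predict roughly 0.0221 + 0.0320 + 0.0726 = 0.1267 for the mixed+roles condition, well above the reported 0.0519.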

