On the Reliability Limits of LLM-Based Multi-Agent Planning

Ruicheng Ao, Siyang Gao, David Simchi-Levi

arXiv:2603.26993

AI Analysis

The paper gives researchers and practitioners who design multi-agent LLM planning systems a theoretical framework for understanding the fundamental limits of such systems.

This paper establishes theoretical reliability limits for LLM-based multi-agent planning by modeling it as a delegated decision network, showing that any such network is dominated by a centralized Bayes decision maker. The loss due to communication is characterized via expected posterior divergence, reducing to conditional mutual information under logarithmic loss and expected squared posterior error under the Brier score.

This technical note studies the reliability limits of LLM-based multi-agent planning as a delegated decision problem. We model the LLM-based multi-agent architecture as a finite acyclic decision network in which multiple stages process shared model-context information, communicate through language interfaces with limited capacity, and may invoke human review. We show that, without new exogenous signals, any delegated network is decision-theoretically dominated by a centralized Bayes decision maker with access to the same information. In the common-evidence regime, this implies that optimizing over multi-agent directed acyclic graphs under a finite communication budget can be recast as choosing a budget-constrained stochastic experiment on the shared signal. We also characterize the loss induced by communication and information compression. Under proper scoring rules, the gap between the centralized Bayes value and the value after communication admits an expected posterior divergence representation, which reduces to conditional mutual information under logarithmic loss and to expected squared posterior error under the Brier score. These results characterize the fundamental reliability limits of delegated LLM planning. Experiments with LLMs on a controlled problem set further demonstrate these characterizations.
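
The two special cases named in the abstract can be checked numerically on a toy problem. The sketch below is illustrative only and is not taken from the paper: the joint distribution `P_XY`, the one-bit interface `MESSAGE_OF`, and all function names are hypothetical, and the message is assumed to be a deterministic function of the shared signal. Under those assumptions, the extra log loss from acting on the message instead of the full signal should equal the expected KL divergence between the two posteriors (the conditional mutual information I(Y; X | M)), and the extra Brier loss should equal the expected squared posterior error.

```python
import math

# Hypothetical toy joint distribution p(x, y): four signal values, binary outcome.
P_XY = [
    [0.20, 0.05],
    [0.10, 0.15],
    [0.05, 0.20],
    [0.15, 0.10],
]
# Hypothetical lossy interface: a 1-bit message merging signals {0, 1} and {2, 3}.
MESSAGE_OF = [0, 0, 1, 1]


def posteriors(p_xy, message_of):
    """Return p(x), the posteriors p(y | x), and the posteriors p(y | m)."""
    n_y = len(p_xy[0])
    p_x = [sum(row) for row in p_xy]
    post_x = [[v / px for v in row] for row, px in zip(p_xy, p_x)]
    p_my = [[0.0] * n_y for _ in range(max(message_of) + 1)]
    for x, row in enumerate(p_xy):
        for y, v in enumerate(row):
            p_my[message_of[x]][y] += v
    post_m = [[v / sum(row) for v in row] for row in p_my]
    return p_x, post_x, post_m


def log_loss_risk(p_xy, reports):
    """Expected log loss when forecast reports[x] is issued after signal x."""
    return -sum(
        v * math.log(reports[x][y])
        for x, row in enumerate(p_xy)
        for y, v in enumerate(row)
    )


def brier_risk(p_xy, reports):
    """Expected Brier score (squared distance to the one-hot outcome)."""
    total = 0.0
    for x, row in enumerate(p_xy):
        for y, v in enumerate(row):
            total += v * sum(
                ((1.0 if k == y else 0.0) - reports[x][k]) ** 2
                for k in range(len(row))
            )
    return total


def communication_gaps(p_xy, message_of):
    """Compare risk gaps with their posterior-divergence representations."""
    p_x, central, post_m = posteriors(p_xy, message_of)
    # Forecast available downstream of the interface, indexed by signal x.
    delegated = [post_m[message_of[x]] for x in range(len(p_xy))]
    expected_kl = sum(
        px * sum(p * math.log(p / q) for p, q in zip(c, d) if p > 0)
        for px, c, d in zip(p_x, central, delegated)
    )
    expected_sq_error = sum(
        px * sum((p - q) ** 2 for p, q in zip(c, d))
        for px, c, d in zip(p_x, central, delegated)
    )
    return {
        "log_gap": log_loss_risk(p_xy, delegated) - log_loss_risk(p_xy, central),
        "expected_kl": expected_kl,
        "brier_gap": brier_risk(p_xy, delegated) - brier_risk(p_xy, central),
        "expected_sq_error": expected_sq_error,
    }


if __name__ == "__main__":
    for name, value in communication_gaps(P_XY, MESSAGE_OF).items():
        print(f"{name}: {value:.6f}")
```

In this toy setting both gaps are nonnegative, which matches the dominance claim: compressing the shared signal into a message cannot improve on the centralized Bayes forecast.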
