MA CL LG

Oct 1, 2025

Stochastic Self-Organization in Multi-Agent Systems

Nurbek Tastan, Samuel Horvath, Karthik Nandakumar

arXiv:2510.00685v1

4 citations

h-index: 8

Originality: Incremental advance

AI Analysis

This work addresses efficient agent collaboration on tasks beyond the capabilities of a single LLM. It appears incremental: it builds on existing multi-agent systems and contributes a novel communication mechanism.

The paper tackles the problem of optimizing collaboration in multi-agent systems based on Large Language Models by introducing a response-conditioned framework that adapts communication dynamically, achieving robust performance with significant gains in weak LLM regimes where prior methods fail.

Multi-agent systems (MAS) based on Large Language Models (LLMs) have the potential to solve tasks that are beyond the reach of any single LLM. However, this potential can only be realized when the collaboration mechanism between agents is optimized. Specifically, optimizing the communication structure between agents is critical for fruitful collaboration. Most existing approaches rely on fixed topologies, pretrained graph generators, optimization over edges, or external LLM judges, all of which add complexity. In this work, we introduce a response-conditioned framework that adapts communication on-the-fly. Agents independently generate responses to the user query and assess peer contributions using an approximation of the Shapley value. A directed acyclic graph (DAG) is then constructed to regulate the propagation of the responses among agents, which ensures stable and efficient message transmission from high-contributing agents to others. This graph is dynamically updated based on the agent responses from the previous collaboration round. Since the proposed framework enables the self-organization of agents without additional supervision or training, we refer to it as SelfOrg. The SelfOrg framework goes beyond task- and query-level optimization and takes into account the stochastic nature of agent responses. Experiments with both strong and weak LLM backends demonstrate robust performance, with significant gains in the weak regime where prior methods collapse. We also theoretically show that multiple agents increase the chance of correctness and that the correct responses naturally dominate the information flow.
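The round described in the abstract (independent responses, approximate Shapley scoring, then a DAG from high to low contributors) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the value function, the approximation scheme, or the edge rule. The sketch assumes each response is already embedded as a vector, uses a hypothetical coalition value (cosine similarity between the coalition's mean embedding and the mean of all responses), estimates Shapley values by Monte Carlo permutation sampling, and adds an edge only from a higher-scoring agent to a lower-scoring one when their responses are similar enough. The names `approx_shapley`, `build_dag`, and the `threshold` parameter are all illustrative.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def coalition_value(members, embeddings, target):
    """Assumed value function: how well the coalition's mean matches the target."""
    if not members:
        return 0.0
    return cosine(mean_vector([embeddings[i] for i in members]), target)


def approx_shapley(embeddings, num_permutations=200, seed=0):
    """Monte Carlo Shapley estimate: average marginal gain over random orderings."""
    rng = random.Random(seed)
    n = len(embeddings)
    target = mean_vector(embeddings)  # assumed consensus proxy
    scores = [0.0] * n
    agents = list(range(n))
    for _ in range(num_permutations):
        rng.shuffle(agents)
        coalition, prev = [], 0.0
        for agent in agents:
            coalition.append(agent)
            cur = coalition_value(coalition, embeddings, target)
            scores[agent] += cur - prev
            prev = cur
    return [s / num_permutations for s in scores]


def build_dag(embeddings, scores, threshold=0.5):
    """Edges run only from higher-scoring to lower-scoring agents, so no cycles."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    edges = []
    for pos, src in enumerate(order):
        for dst in order[pos + 1:]:
            if cosine(embeddings[src], embeddings[dst]) >= threshold:
                edges.append((src, dst))
    return order, edges
```

Because edges always point from an earlier to a later position in the contribution ranking, that ranking is itself a topological order, which is one simple way to guarantee acyclicity. In a multi-round setting, the same two functions would be re-run on the responses from the previous round to update the graph.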
