CL · AI · Mar 26

CRAFT: Grounded Multi-Agent Coordination Under Partial Information

arXiv:2603.25268 · 94.22 citations · h-index: 6 · Has Code
AI Analysis

This work addresses a fundamental challenge in multi-agent AI systems, but its contribution is incremental: it provides a diagnostic framework rather than a solution.

The paper tackles multi-agent coordination under partial information by introducing CRAFT, a benchmark for evaluating pragmatic communication in large language models. It finds that stronger reasoning ability does not reliably improve coordination: smaller models often match or outperform frontier systems.

We introduce CRAFT, a multi-agent benchmark for evaluating pragmatic communication in large language models under strict partial information. In this setting, multiple agents with complementary but incomplete views must coordinate through natural language to construct a shared 3D structure that no single agent can fully observe. We formalize this problem as a multi-sender pragmatic reasoning task and provide a diagnostic framework that decomposes failures into spatial grounding, belief modeling, and pragmatic communication errors, including a taxonomy of behavioral failure profiles in both frontier and open-weight models. Across a diverse set of 15 models (8 open-weight and 7 frontier, including reasoning models), we find that stronger reasoning ability does not reliably translate to better coordination: smaller open-weight models often match or outperform frontier systems, and improved individual communication does not guarantee successful collaboration. These results suggest that multi-agent coordination remains a fundamentally unsolved challenge for current language models. Our code can be found at https://github.com/csu-signal/CRAFT
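
To make the "strict partial information" setting concrete, here is a minimal Python sketch. It is not taken from the CRAFT repository: the grid representation, the round-robin split, and the names `split_views`, `merge_views`, and `is_strictly_partial` are all hypothetical, chosen only to illustrate views that are individually incomplete but jointly cover the target structure.

```python
from typing import Dict, List, Tuple

# Hypothetical representation (an assumption, not CRAFT's actual format):
# a 3D structure maps (x, y, z) grid cells to block colors.
Cell = Tuple[int, int, int]
Structure = Dict[Cell, str]


def split_views(target: Structure, n_agents: int) -> List[Structure]:
    """Deal the target's cells round-robin so each agent sees only a subset."""
    views: List[Structure] = [{} for _ in range(n_agents)]
    for i, cell in enumerate(sorted(target)):
        views[i % n_agents][cell] = target[cell]
    return views


def merge_views(views: List[Structure]) -> Structure:
    """Combine all private views into a single structure."""
    merged: Structure = {}
    for view in views:
        merged.update(view)
    return merged


def is_strictly_partial(views: List[Structure], target: Structure) -> bool:
    """True if no single view equals the target but their union does."""
    no_agent_sees_all = all(view != target for view in views)
    return no_agent_sees_all and merge_views(views) == target


# Illustrative target: six blocks split among three agents.
target: Structure = {
    (0, 0, 0): "red",
    (1, 0, 0): "blue",
    (0, 1, 0): "red",
    (1, 1, 0): "blue",
    (0, 0, 1): "green",
    (1, 0, 1): "yellow",
}
views = split_views(target, n_agents=3)
```

In this toy setup each agent would have to describe its own cells in natural language for the group to recover `target`; the sketch models only the information split, not the communication or the failure taxonomy the paper studies.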

