AI · Oct 8, 2025

AgentAsk: Multi-Agent Systems Need to Ask

Bohan Lin, Kuo Yang, Yingchuan Lai, Yudong Zhang, Chen Zhang, Guibin Zhang, Xinlei Yu, Miao Yu, Xu Wang, Yang Wang

arXiv:2510.07593v1 · 9.62 citations · h-index: 7

Originality: Incremental advance

AI Analysis

AgentAsk addresses reliability problems in LLM-based multi-agent systems on math, reasoning, and coding tasks and offers a scalable fix. The advance is incremental, since it builds on existing multi-agent frameworks.

The paper tackles error propagation in multi-agent systems built on large language models, which often underperform single-agent baselines. It proposes AgentAsk, a clarification module that improves accuracy and robustness across benchmarks while keeping added latency and extra cost under 5%.

Multi-agent systems built on large language models (LLMs) promise enhanced problem-solving capabilities through collaborative division of labor. However, they frequently underperform single-agent baselines due to edge-level error cascades: minor inaccuracies at one message handoff propagate across the entire chain. We propose AgentAsk, a lightweight and plug-and-play clarification module that treats every inter-agent message as a potential failure point and inserts minimally necessary questions to arrest error propagation. AgentAsk follows a three-stage pipeline: (i) distilling edge-level judgments from curated failure traces into a compact policy, (ii) supervising the policy to determine when/what/whom/how to ask, and (iii) optimizing online with E-GRPO, a reinforcement learning objective that balances accuracy, latency, and cost. The module is architecture-agnostic and easy to integrate into existing orchestration. Across math, reasoning, and coding benchmarks, AgentAsk consistently improves accuracy and robustness over public multi-agent implementations while keeping overhead minimal, with added latency and extra cost both below 5%, and approaches the performance of a strong evaluator. Beyond empirical improvements, we contribute a principled taxonomy of edge-level errors and a practical recipe for link-local intervention, offering a scalable pathway toward more reliable LLM-based multi-agent systems.
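To make the idea of link-local intervention concrete, the following is a minimal sketch of a clarification gate placed on a single inter-agent message edge. It is not the authors' implementation: the names `Message`, `AskDecision`, `toy_policy`, and `deliver` are hypothetical, and the keyword-based `toy_policy` merely stands in for the learned policy that the abstract says decides when, what, and whom to ask. The `max_asks` budget is likewise an assumption used here to keep overhead bounded.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class Message:
    sender: str
    receiver: str
    content: str


@dataclass
class AskDecision:
    ask: bool           # "when": should we intervene on this edge?
    question: str = ""  # "what": the clarifying question
    target: str = ""    # "whom": which agent should answer


# Hypothetical type aliases: an agent maps a question to an answer,
# and a policy maps a message to an ask/no-ask decision.
Agent = Callable[[str], str]
Policy = Callable[[Message], AskDecision]


def toy_policy(msg: Message) -> AskDecision:
    """Stand-in for a learned policy: flag messages with vague markers."""
    vague_markers = ("tbd", "???", "some value")
    if any(marker in msg.content.lower() for marker in vague_markers):
        question = f"Please state the missing value in: {msg.content!r}"
        return AskDecision(ask=True, question=question, target=msg.sender)
    return AskDecision(ask=False)


def deliver(
    msg: Message,
    policy: Policy,
    agents: Dict[str, Agent],
    max_asks: int = 1,
) -> Tuple[Message, int]:
    """Pass a message along one edge, asking at most `max_asks` questions.

    Returns the (possibly augmented) message and the number of questions asked.
    """
    asks = 0
    for _ in range(max_asks):
        decision = policy(msg)
        if not decision.ask:
            break
        answer = agents[decision.target](decision.question)
        # Append the clarification so the receiver sees the resolved detail.
        msg = Message(
            msg.sender,
            msg.receiver,
            f"{msg.content}\n[clarification] {answer}",
        )
        asks += 1
    return msg, asks


# Usage: the planner's message omits a value, so the gate asks the planner once.
agents = {"planner": lambda question: "x = 3"}
vague = Message("planner", "solver", "Solve for y given x = TBD")
resolved, n_asks = deliver(vague, toy_policy, agents)
```

Because the gate wraps a single handoff and needs only the message plus a way to query the sender, a wrapper of this shape can in principle sit in front of any orchestration layer, which is one way to read the abstract's "architecture-agnostic" claim.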
