CLMar 27

AgentCollab: A Self-Evaluation-Driven Collaboration Paradigm for Efficient LLM Agents

Wenbo Gao, Renxi Liu, Xian Wang, Fang Guo, Shuai Yang, Xi Chen, Hui-Ling Zhen, Hanting Chen, Weizhe Lin, Xiaosong Li, Yaoyuan Wang

arXiv:2603.2603478.42 citationsh-index: 6

AI Analysis

This addresses efficiency and robustness challenges for developers and users of autonomous LLM agents, representing an incremental advancement in agent collaboration methods.

The paper tackles the trade-off between execution efficiency and reasoning robustness in LLM agents by introducing AgentCollab, a self-evaluation-driven collaboration framework that dynamically coordinates models of different capabilities, resulting in improved accuracy-efficiency Pareto frontiers on multi-step benchmarks.

Autonomous agents powered by large language models (LLMs) perform complex tasks through long-horizon reasoning and tool interaction, where a fundamental trade-off arises between execution efficiency and reasoning robustness. Models at different capability-cost levels offer complementary advantages: lower-cost models enable fast execution but may struggle on difficult reasoning segments, while stronger models provide more robust reasoning at higher computational cost. We present AgentCollab, a self-driven collaborative inference framework that dynamically coordinates models with different reasoning capacities during agent execution. Instead of relying on external routing modules, the framework uses the agent's own self-reflection signal to determine whether the current reasoning trajectory is making meaningful progress, and escalates control to a stronger reasoning tier only when necessary. To further stabilize long-horizon execution, we introduce a difficulty-aware cumulative escalation strategy that allocates additional reasoning budget based on recent failure signals. In our experiments, we instantiate this framework using a two-level small-large model setting. Experiments on diverse multi-step agent benchmarks show that AgentCollab consistently improves the accuracy-efficiency Pareto frontier of LLM agents.

View on arXiv PDF

Similar