AgentCollab: A Self-Evaluation-Driven Collaboration Paradigm for Efficient LLM Agents
AgentCollab addresses efficiency and robustness challenges faced by developers and users of autonomous LLM agents, and represents an incremental advance in agent collaboration methods.
The paper tackles the trade-off between execution efficiency and reasoning robustness in LLM agents. It introduces AgentCollab, a self-evaluation-driven collaboration framework that dynamically coordinates models of different capabilities and improves the accuracy-efficiency Pareto frontier on multi-step agent benchmarks.
Autonomous agents powered by large language models (LLMs) perform complex tasks through long-horizon reasoning and tool interaction, where a fundamental trade-off arises between execution efficiency and reasoning robustness. Models at different capability-cost levels offer complementary advantages: lower-cost models enable fast execution but may struggle on difficult reasoning segments, while stronger models provide more robust reasoning at higher computational cost. We present AgentCollab, a self-driven collaborative inference framework that dynamically coordinates models with different reasoning capacities during agent execution. Instead of relying on external routing modules, the framework uses the agent's own self-reflection signal to determine whether the current reasoning trajectory is making meaningful progress, and escalates control to a stronger reasoning tier only when necessary. To further stabilize long-horizon execution, we introduce a difficulty-aware cumulative escalation strategy that allocates additional reasoning budget based on recent failure signals. In our experiments, we instantiate this framework using a two-level small-large model setting. Experiments on diverse multi-step agent benchmarks show that AgentCollab consistently improves the accuracy-efficiency Pareto frontier of LLM agents.
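The control loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact escalation rule, so the rule used here (large-model budget equals a base budget times the number of failures in a recent window) is an assumption. The names `CollabController`, `base_budget`, `window`, `small_step`, `large_step`, and `HARD` are hypothetical, and the two step functions are toy stand-ins for real model calls that return a new state plus a self-reflection signal.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# A step function maps a state to (new_state, made_progress).
# `made_progress` stands in for the agent's self-reflection signal.
StepFn = Callable[[int], Tuple[int, bool]]


@dataclass
class CollabController:
    small_step: StepFn
    large_step: StepFn
    base_budget: int = 1   # hypothetical: large-model steps per recent failure
    window: int = 3        # hypothetical: number of recent signals remembered
    recent: List[bool] = field(default_factory=list)  # True marks a failure
    large_steps_left: int = 0
    trace: List[str] = field(default_factory=list)

    def step(self, state: int) -> int:
        # While escalation budget remains, the stronger tier keeps control.
        if self.large_steps_left > 0:
            self.large_steps_left -= 1
            self.trace.append("large")
            new_state, _ = self.large_step(state)
            return new_state
        # Otherwise the cheaper tier acts and reports its own progress.
        self.trace.append("small")
        new_state, progressed = self.small_step(state)
        self.recent = (self.recent + [not progressed])[-self.window:]
        if not progressed:
            # Assumed cumulative rule: budget grows with recent failures.
            self.large_steps_left = self.base_budget * sum(self.recent)
        return new_state


# Toy task: the state is a progress counter; the small model stalls on HARD.
HARD = {2, 3, 4}


def small_step(state: int) -> Tuple[int, bool]:
    return (state, False) if state in HARD else (state + 1, True)


def large_step(state: int) -> Tuple[int, bool]:
    return state + 1, True


def run(controller: CollabController, goal: int, max_steps: int = 50) -> int:
    state = 0
    while state < goal and len(controller.trace) < max_steps:
        state = controller.step(state)
    return state


controller = CollabController(small_step=small_step, large_step=large_step)
final_state = run(controller, goal=6)
print(final_state, controller.trace)
```

In this toy run the first stall grants one large-model step and the second stall, arriving while the first is still inside the window, grants two. The stronger tier therefore handles three of the eight steps, and control returns to the small model once the hard segment is passed.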