MonoScale: Scaling Multi-Agent System with Monotonic Improvement
The paper addresses a critical scalability issue in LLM-based multi-agent systems, though the contribution appears incremental: it is an enhancement to existing routing methods.
The paper tackles the performance collapse that can occur when an LLM-based multi-agent system is scaled up by adding new agents. It proposes MonoScale, a framework that generates familiarization tasks and uses natural-language memory to guide routing. On benchmarks such as GAIA and Humanity's Last Exam, this yields stable performance gains as the agent pool grows.
In recent years, LLM-based multi-agent systems (MAS) have advanced rapidly, using a router to decompose tasks and delegate subtasks to specialized agents. A natural way to expand capability is to scale up the agent pool by continually integrating new functional agents or tool interfaces, but naive expansion can trigger performance collapse when the router cold-starts on newly added, heterogeneous, and unreliable agents. We propose MonoScale, an expansion-aware update framework that proactively generates a small set of agent-conditioned familiarization tasks, harvests evidence from both successful and failed interactions, and distills it into auditable natural-language memory to guide future routing. We formalize sequential augmentation as a contextual bandit and perform trust-region memory updates, yielding a monotonic non-decreasing performance guarantee across onboarding rounds. Experiments on GAIA and Humanity's Last Exam show stable gains as the agent pool grows, outperforming naive scale-up and strong-router fixed-pool baselines.
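To make the idea of a guarded memory update concrete, here is a minimal Python sketch of an accept-only-if-not-worse update with a bounded step size. It is not the paper's algorithm: the abstract gives no implementation details, so every name here is hypothetical (`SUCCESS`, `INCUMBENT`, `route`, `evaluate`, `num_edits`, `gated_update`, the agent name `new_tool`). The success rates are invented, the "trust region" is approximated by a cap on how many notes may change, and the score is read from an exact table, whereas a real system would presumably estimate it from familiarization tasks.

```python
from typing import Callable, Iterable, Tuple

Memory = Tuple[str, ...]  # routing notes written in plain language

# Hypothetical success rates per (task type, agent), invented for this toy.
SUCCESS = {
    ("web", "search"): 0.9, ("web", "new_tool"): 0.2,
    ("code", "coder"): 0.8, ("code", "new_tool"): 0.3,
    ("math", "coder"): 0.5, ("math", "new_tool"): 0.9,
}
# Agent the router used for each task type before new_tool was onboarded.
INCUMBENT = {"web": "search", "code": "coder", "math": "coder"}


def route(task_type: str, memory: Memory) -> str:
    """Send a task to new_tool only when a memory note says so."""
    if f"use new_tool for {task_type}" in memory:
        return "new_tool"
    return INCUMBENT[task_type]


def evaluate(memory: Memory) -> float:
    """Mean success rate over all task types under the given memory."""
    rates = [SUCCESS[(t, route(t, memory))] for t in INCUMBENT]
    return sum(rates) / len(rates)


def num_edits(old: Memory, new: Memory) -> int:
    """Notes added or removed: a crude stand-in for a trust-region distance."""
    return len(set(old) ^ set(new))


def gated_update(
    memory: Memory,
    candidates: Iterable[Memory],
    score_fn: Callable[[Memory], float],
    max_edits: int = 2,
) -> Tuple[Memory, float]:
    """Pick the best candidate within max_edits of the current memory.

    The current memory is the default, so the returned score can never be
    lower than the score before the update.
    """
    best, best_score = memory, score_fn(memory)
    for cand in candidates:
        if num_edits(memory, cand) > max_edits:
            continue  # outside the allowed step size
        score = score_fn(cand)
        if score > best_score:
            best, best_score = cand, score
    return best, best_score


base_memory: Memory = ()
candidates = [
    ("use new_tool for web",),   # hurts: new_tool is weak on web tasks
    ("use new_tool for math",),  # helps: new_tool is strong on math tasks
    # Changes three notes at once, so it exceeds max_edits and is skipped.
    ("use new_tool for web", "use new_tool for code", "use new_tool for math"),
]
new_memory, new_score = gated_update(base_memory, candidates, evaluate)
print(new_memory, round(new_score, 3), ">=", round(evaluate(base_memory), 3))
```

In this toy, adopting the first candidate blindly would lower the mean success rate, which mirrors the collapse described for naive expansion. The gate keeps the old memory whenever no admissible candidate scores higher, so the score cannot decrease across rounds. With noisy estimates instead of an exact table, that guarantee would hold only up to estimation error.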