CodeDelegator: Mitigating Context Pollution via Role Separation in Code-as-Action Agents
This work addresses a bottleneck in AI agents for real-world tasks that require both strategic planning and detailed code implementation. It represents an incremental improvement.
The paper tackles context pollution in code-as-action agents: when a single agent handles both planning and implementation, debugging traces and intermediate failures accumulate in its context and degrade performance. It proposes CodeDelegator, a multi-agent framework with role separation, and reports that the framework is effective across diverse benchmarks.
Recent advances in large language models (LLMs) allow agents to represent actions as executable code, offering greater expressivity than traditional tool-calling. However, real-world tasks often demand both strategic planning and detailed implementation. Using a single agent for both leads to context pollution from debugging traces and intermediate failures, impairing long-horizon performance. We propose CodeDelegator, a multi-agent framework that separates planning from implementation via role specialization. A persistent Delegator maintains strategic oversight by decomposing tasks, writing specifications, and monitoring progress without executing code. For each sub-task, a new Coder agent is instantiated with a clean context containing only its specification, shielding it from prior failures. To coordinate between agents, we introduce Ephemeral-Persistent State Separation (EPSS), which isolates each Coder's execution state while preserving global coherence, preventing debugging traces from polluting the Delegator's context. Experiments on various benchmarks demonstrate the effectiveness of CodeDelegator across diverse scenarios.
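The role separation described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: every name in it (`Coder`, `Delegator`, `SubTaskResult`, `split_on_semicolons`, `flaky_attempt`) is hypothetical, the LLM calls are replaced by plain callables, and the retry limit is an arbitrary assumption. It only shows the idea behind EPSS as stated above: each Coder keeps its debugging trace in ephemeral state, and only a compact result reaches the persistent Delegator.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SubTaskResult:
    """Compact summary handed back to the Delegator (hypothetical type)."""
    spec: str
    success: bool
    output: str


class Coder:
    """Ephemeral worker: created per sub-task, sees only its specification."""

    def __init__(self, spec: str) -> None:
        self.spec = spec
        # Ephemeral state: the debugging trace lives and dies with this Coder.
        self.trace: List[str] = []

    def run(self, attempt_fn: Callable[[str, int], str],
            max_attempts: int = 3) -> SubTaskResult:
        # attempt_fn stands in for "write and execute code"; an assumption.
        for attempt in range(1, max_attempts + 1):
            try:
                output = attempt_fn(self.spec, attempt)
                self.trace.append(f"attempt {attempt}: ok")
                return SubTaskResult(self.spec, True, output)
            except Exception as exc:
                self.trace.append(f"attempt {attempt}: {exc!r}")
        return SubTaskResult(self.spec, False, "")


class Delegator:
    """Persistent planner: decomposes tasks and tracks results, runs no code."""

    def __init__(self, decompose: Callable[[str], List[str]]) -> None:
        self.decompose = decompose
        # Persistent state: result summaries only, never Coder traces.
        self.context: List[SubTaskResult] = []

    def solve(self, task: str,
              attempt_fn: Callable[[str, int], str]) -> List[SubTaskResult]:
        for spec in self.decompose(task):
            coder = Coder(spec)  # fresh Coder with a clean context per sub-task
            self.context.append(coder.run(attempt_fn))
            # The Coder and its trace go out of scope here.
        return self.context


def split_on_semicolons(task: str) -> List[str]:
    """Toy decomposition standing in for LLM-based planning."""
    return [part.strip() for part in task.split(";") if part.strip()]


def flaky_attempt(spec: str, attempt: int) -> str:
    """Toy executor that fails on the first try to simulate debugging."""
    if attempt == 1:
        raise RuntimeError("simulated failure")
    return f"done: {spec}"


results = Delegator(split_on_semicolons).solve("load data; fit model", flaky_attempt)
```

In this sketch the first attempt at each sub-task fails, yet the Delegator's `context` holds only the two final `SubTaskResult` entries. The failure messages stay in each Coder's `trace`, which is dropped when that Coder is discarded.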