AI · May 29

Learning Agent-Compatible Context Management for Long-Horizon Tasks

Lu Yi, Runlin Lei, Liuyi Yao, Yuexiang Xie, Yuyang Li, Wenhao Zhang, Zhewei Wei, Yaliang Li, Jian-Yun Nie

arXiv:2605.30785 · 99.1 · h-index: 7

Predicted impact: top 1% in AI · last 90 days
Originality: Highly original

AI Analysis

This work is significant for developers and users of LLM agents, as it provides a practical method to improve the performance of existing, potentially closed-source, agents on long-horizon tasks without modifying the agent itself.

This paper addresses the problem of context degradation and reasoning failures in LLM agents performing long-horizon tasks by introducing Adaptive Context Management (AdaCoM). AdaCoM trains an external LLM to manage the context of a frozen agent, leading to substantial performance improvements across web search and deep research benchmarks.

LLM agents increasingly face long-horizon tasks such as web search and deep research in real-world applications, where accumulated context can cause long-context degradation and reasoning failures. Prior work mitigates this through context management with agent-side context control or fixed strategies such as summarization. These approaches require training the agent itself for adaptation, which makes them impractical for closed-source agents, and they ignore that different agents may require different strategies. We introduce Adaptive Context Management (AdaCoM), which trains an external LLM to manage the context of a frozen agent through flexible modification actions and end-to-end reinforcement learning. Across diverse agents on web search and deep research benchmarks, AdaCoM substantially improves performance by preserving task constraints and progress while pruning stale content. The learned strategies reveal a Fidelity-Reliability Trade-off: agents with higher vanilla ReAct performance benefit from higher-fidelity context preservation, whereas lower-performing agents require more aggressive compression to stay within a reliable reasoning regime. Transfer experiments show that AdaCoM generalizes most effectively across agents with similar capability (measured by vanilla ReAct performance), suggesting a practical path toward reusable context managers for agent systems.
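
The arrangement the abstract describes, an external manager editing the context that a frozen agent sees, can be illustrated with a small sketch. This is not the paper's implementation: the `Entry` type, the `keep`/`delete`/`rewrite` action names, and the `run_episode` loop are all hypothetical, and a hand-written rule stands in for the learned manager LLM. The reinforcement learning that would train such a manager is not shown.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Entry:
    step: int
    role: str  # e.g. "task", "observation", "answer" (labels assumed for this sketch)
    text: str


# A modification action is (op, index, payload). The op names are hypothetical.
Action = Tuple[str, int, str]


def apply_actions(context: List[Entry], actions: List[Action]) -> List[Entry]:
    """Return a new context with the edits applied; unmentioned entries are kept."""
    ops = {idx: (op, payload) for op, idx, payload in actions}
    new_context = []
    for i, entry in enumerate(context):
        op, payload = ops.get(i, ("keep", ""))
        if op == "delete":
            continue
        if op == "rewrite":
            entry = Entry(entry.step, entry.role, payload)
        new_context.append(entry)
    return new_context


def rule_based_manager(context: List[Entry], max_observations: int = 2) -> List[Action]:
    """Stand-in for a learned manager: keep the task, drop all but the newest observations."""
    obs_idx = [i for i, e in enumerate(context) if e.role == "observation"]
    stale = obs_idx[:-max_observations] if max_observations > 0 else obs_idx
    return [("delete", i, "") for i in stale]


def run_episode(
    agent_step: Callable[[List[Entry]], Tuple[str, str]],
    manager: Callable[[List[Entry]], List[Action]],
    task: str,
    max_steps: int,
) -> List[Entry]:
    """Alternate manager edits and agent steps; the agent itself is never modified."""
    context = [Entry(0, "task", task)]
    for step in range(1, max_steps + 1):
        context = apply_actions(context, manager(context))
        role, text = agent_step(context)
        context.append(Entry(step, role, text))
        if role == "answer":
            break
    return context
```

In this sketch the agent is just a callable that reads the managed context and returns its next output, so swapping in a different manager changes what the agent sees without touching the agent. A more aggressive manager (smaller `max_observations`, or `rewrite` actions that replace long observations with summaries) would correspond to the heavier compression the abstract associates with lower-performing agents.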
