ContextBudget: Budget-Aware Context Management for Long-Horizon Search Agents
This work addresses the context budget constraint that LLM-based agents face in long-horizon tasks such as QA and web browsing. It is an incremental improvement that proposes a novel method for a known bottleneck.
The paper tackles the limited context size of LLM-based agents in long-horizon reasoning. It proposes Budget-Aware Context Management (BACM), which formulates context management as a sequential decision problem with a budget constraint. Experiments show that BACM-RL achieves over 1.6x gains over baselines in high-complexity settings.
LLM-based agents show strong potential for long-horizon reasoning, yet their context size is limited by deployment factors (e.g., memory, latency, and cost), yielding a constrained context budget. As interaction histories grow, this induces a trade-off between retaining past information and staying within the context limit. To address this challenge, we propose Budget-Aware Context Management (BACM), which formulates context management as a sequential decision problem with a context budget constraint. It enables agents to assess the available budget before incorporating new observations and decide when and how much of the interaction history to compress. We further develop BACM-RL, an end-to-end curriculum-based reinforcement learning approach that learns compression strategies under varying context budgets. Experiments on compositional multi-objective QA and long-horizon web browsing benchmarks show that BACM-RL consistently outperforms prior methods across model scales and task complexities, achieving over $1.6\times$ gains over strong baselines in high-complexity settings, while maintaining strong advantages as budgets shrink, where most methods exhibit a downward performance trend.
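To make the budget check concrete, the following minimal sketch shows one way an agent loop might assess the remaining budget before admitting a new observation and compress only as much history as needed. It is not the paper's BACM-RL policy, which is learned with reinforcement learning. Here a fixed oldest-first heuristic stands in for the learned decision. The names `BudgetedContext`, `count_tokens`, and `compress` are hypothetical, whitespace-separated words stand in for real tokens, and truncation stands in for a real summarizer.

```python
from dataclasses import dataclass, field
from typing import List


def count_tokens(text: str) -> int:
    """Hypothetical token counter: whitespace-separated words stand in for tokens."""
    return len(text.split())


def compress(text: str, keep: int) -> str:
    """Placeholder compressor: keeps the first `keep` words (a real system would summarize)."""
    return " ".join(text.split()[:keep])


@dataclass
class BudgetedContext:
    budget: int  # maximum number of tokens the context may hold (assumed fixed)
    history: List[str] = field(default_factory=list)

    def used(self) -> int:
        """Tokens currently occupied by the interaction history."""
        return sum(count_tokens(entry) for entry in self.history)

    def add(self, observation: str) -> None:
        """Check the budget first, then compress oldest entries only as far as needed."""
        needed = count_tokens(observation)
        overflow = self.used() + needed - self.budget

        # Heuristic stand-in for a learned policy: shrink oldest entries first.
        i = 0
        while overflow > 0 and i < len(self.history):
            size = count_tokens(self.history[i])
            keep = max(size - overflow, 0)
            self.history[i] = compress(self.history[i], keep)
            overflow -= size - keep
            i += 1

        # Drop entries that were compressed down to nothing.
        self.history = [entry for entry in self.history if entry]

        # If the observation alone exceeds the budget, truncate it as well.
        if overflow > 0:
            observation = compress(observation, needed - overflow)
        self.history.append(observation)


if __name__ == "__main__":
    ctx = BudgetedContext(budget=10)
    ctx.add("a b c d e f")
    ctx.add("g h i j k l")
    print(ctx.history, ctx.used())
```

The design choice illustrated here is that the budget is consulted before the observation is appended, so the context never exceeds the limit. Deciding which entries to compress, and by how much, is what a learned policy would replace.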