AI · Jan 12

Beyond Entangled Planning: Task-Decoupled Planning for Long-Horizon Agents

Yunfan Li, Bingbing Xu, Xueyun Tian, Xiucheng Xu, Huawei Shen

arXiv:2601.07577v1 · 3 citations · h-index: 12

Originality: Incremental advance

AI Analysis

This paper addresses entangled contexts and error propagation in planning for long-horizon agents, offering a more robust and efficient approach. The advance is incremental because it builds on existing planning paradigms.

The paper tackles planning bottlenecks in long-horizon agents by proposing Task-Decoupled Planning (TDP), which decomposes a task into a directed acyclic graph of sub-goals so that reasoning and replanning stay confined to the active sub-task. On TravelPlanner, ScienceWorld, and HotpotQA, it reports better performance than strong baselines and up to 82% lower token consumption.

Recent advances in large language models (LLMs) have enabled agents to autonomously execute complex, long-horizon tasks, yet planning remains a primary bottleneck for reliable task execution. Existing methods typically fall into two paradigms: step-wise planning, which is reactive but often short-sighted; and one-shot planning, which generates a complete plan upfront yet is brittle to execution errors. Crucially, both paradigms suffer from entangled contexts, where the agent must reason over a monolithic history spanning multiple sub-tasks. This entanglement increases cognitive load and lets local errors propagate across otherwise independent decisions, making recovery computationally expensive. To address this, we propose Task-Decoupled Planning (TDP), a training-free framework that replaces entangled reasoning with task decoupling. TDP decomposes tasks into a directed acyclic graph (DAG) of sub-goals via a Supervisor. Using a Planner and Executor with scoped contexts, TDP confines reasoning and replanning to the active sub-task. This isolation prevents error propagation and corrects deviations locally without disrupting the workflow. Results on TravelPlanner, ScienceWorld, and HotpotQA show that TDP outperforms strong baselines while reducing token consumption by up to 82%, demonstrating that sub-task decoupling improves both robustness and efficiency for long-horizon agents.
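The decoupling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `SubGoal` and `run_decoupled` are hypothetical, the Supervisor's decomposition is assumed to be given as an explicit list of sub-goals, the Planner and Executor are collapsed into a single plain callable instead of LLM calls, and a simple retry stands in for local replanning. It only shows the two structural points, that each sub-goal sees a scoped context and that a failure is handled without touching other sub-goals.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class SubGoal:
    """One node of the sub-goal DAG (hypothetical structure)."""
    name: str
    depends_on: Tuple[str, ...] = ()
    result: Optional[str] = None
    attempts: int = 0


def run_decoupled(
    goals: List[SubGoal],
    execute: Callable[[str, Dict[str, Optional[str]]], str],
    max_retries: int = 2,
) -> Dict[str, Optional[str]]:
    """Run sub-goals in dependency order, each with a scoped context."""
    by_name = {g.name: g for g in goals}
    graph = {g.name: set(g.depends_on) for g in goals}
    # static_order() yields every node after all of its predecessors.
    for name in TopologicalSorter(graph).static_order():
        goal = by_name[name]
        # Scoped context: only the results of direct dependencies,
        # not the full history of every earlier sub-goal.
        context = {dep: by_name[dep].result for dep in goal.depends_on}
        for _ in range(max_retries + 1):
            goal.attempts += 1
            try:
                goal.result = execute(goal.name, context)
                break
            except RuntimeError:
                # Assumption: retrying stands in for local replanning.
                # Completed sub-goals are never re-run.
                continue
        else:
            raise RuntimeError(
                f"sub-goal {name!r} failed after {goal.attempts} attempts"
            )
    return {g.name: g.result for g in goals}
```

In this sketch, if the executor fails once on a middle sub-goal, only that sub-goal is retried. Its upstream results are reused as they are, and downstream sub-goals receive only the outputs they declared a dependency on.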

