AI | Feb 22

Limited Reasoning Space: The cage of long-horizon reasoning in LLMs

Zhenyu Li, Guanlin Wu, Cheems Wang, Yongqiang Zhao

arXiv:2602.19281v1 | 1 citation | h-index: 1

Originality: Incremental advance

AI Analysis

The paper addresses a critical bottleneck in improving the reasoning of AI systems on complex tasks, though it is an incremental improvement over existing planning methods.

The paper tackles performance collapse in large language models when compute budgets are increased for long-horizon reasoning tasks. It proposes Halo, a model predictive control framework that dynamically regulates planning to achieve controllable reasoning, and reports improved performance over static baselines.

Test-time compute strategies, such as Chain-of-Thought (CoT), have significantly enhanced the ability of large language models to solve complex tasks like logical reasoning. However, empirical studies indicate that simply increasing the compute budget can sometimes lead to a collapse in test-time performance when employing typical task decomposition strategies such as CoT. This work hypothesizes that reasoning failures with larger compute budgets stem from static planning methods, which can hardly perceive the intrinsic boundaries of LLM reasoning. We term this the Limited Reasoning Space hypothesis and analyze it theoretically through the lens of a non-autonomous stochastic dynamical system. The analysis suggests that there is an optimal range for compute budgets; over-planning can lead to redundant feedback and may even impair reasoning capabilities. To exploit the compute-scaling benefits and suppress over-planning, this work proposes Halo, a model predictive control framework for LLM planning. Halo is designed for long-horizon tasks with reason-based planning and uses an entropy-driven dual controller, which adopts a Measure-then-Plan strategy to achieve controllable reasoning. Experimental results demonstrate that Halo outperforms static baselines on complex long-horizon tasks by dynamically regulating planning at the reasoning boundary.
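The abstract does not spell out how the entropy-driven controller works, so the following is only a minimal sketch of what a Measure-then-Plan loop could look like, not Halo's actual algorithm. It assumes each reasoning step exposes a discrete probability distribution (for example, over candidate next steps), measures its Shannon entropy, and then picks a planning action. The function names, the two thresholds `low` and `high`, the `max_steps` cap, and the action labels are all hypothetical choices made for illustration.

```python
import math
from typing import List, Sequence


def shannon_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def measure_then_plan(
    step_distributions: Sequence[Sequence[float]],
    low: float = 0.3,    # hypothetical threshold: below this, stop planning
    high: float = 1.0,   # hypothetical threshold: above this, decompose further
    max_steps: int = 8,  # hypothetical hard cap on the planning budget
) -> List[str]:
    """Illustrative controller: measure uncertainty first, then choose an action.

    Returns one action per step: "expand" (plan in more detail),
    "continue" (keep the current plan), or "stop" (end planning).
    """
    actions: List[str] = []
    for step, probs in enumerate(step_distributions):
        if step >= max_steps:
            # Budget exhausted: stop rather than over-plan.
            actions.append("stop")
            break
        h = shannon_entropy(probs)  # Measure
        if h < low:                 # Plan: confident enough, stop here
            actions.append("stop")
            break
        if h > high:                # Plan: very uncertain, decompose further
            actions.append("expand")
        else:                       # Plan: moderate uncertainty, carry on
            actions.append("continue")
    return actions


if __name__ == "__main__":
    trace = [[0.25, 0.25, 0.25, 0.25], [0.5, 0.5], [0.99, 0.01], [0.5, 0.5]]
    print(measure_then_plan(trace))
```

In this sketch the controller stops as soon as uncertainty falls below `low`, which is one simple way to avoid spending budget past the point where further planning adds only redundant feedback. A static baseline would instead run a fixed number of steps regardless of the measured entropy.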
