AI · May 7

Extracting Search Trees from LLM Reasoning Traces Reveals Myopic Planning

Sixing Chen, Ji-An Li, Saner Cakir, Sinan Akcali, Kayla Lee, Marcelo G. Mattar

arXiv:2605.06840 · 78.0

Predicted impact: top 39% in AI · last 90 days

Originality: Incremental advance

AI Analysis

This work reveals a fundamental difference between LLM and human planning, offering guidance for aligning LLM reasoning with human-like strategic thinking.

The authors extract search trees from LLM reasoning traces in the game four-in-a-row and find that LLM planning is myopic: performance is driven by search breadth rather than depth, and move choices ignore deep nodes, contrasting with human planning where deep search drives performance.

Large language models (LLMs), especially reasoning models, generate extended chain-of-thought (CoT) reasoning that often contains explicit deliberation over future outcomes. Yet whether this deliberation constitutes genuine planning, how it is structured, and what aspects of it drive performance remain poorly understood. In this work, we introduce a new method to characterize LLM planning by extracting and quantifying search trees from reasoning traces in the four-in-a-row board game. By fitting computational models on the extracted search trees, we characterize how plans are structured and how they influence move decisions. We find that LLMs' search is shallower than humans', and that performance is predicted by search breadth rather than depth. Most strikingly, although LLMs expand deep nodes in their traces, their move choices are best explained by a myopic model that ignores those nodes entirely. A causal intervention study where we selectively prune CoT paragraphs further suggests that move selection is driven predominantly by shallow rather than deep nodes. These patterns contrast with human planning, where performance is driven primarily by deep search. Together, our findings reveal a key difference between LLM and human planning: while human expertise is driven by deeper search, LLMs do not act on deep lookahead. This dissociation offers targeted guidance for aligning LLM and human planning. More broadly, our framework provides a generalizable approach for interpreting the structure of LLM planning across strategic domains.
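
To make the depth/breadth and myopic/deep distinctions concrete, here is a minimal sketch of how such quantities could be computed on a toy search tree. It is an illustration only, not the authors' extraction method or fitted model: the `Node` class, the move labels, the heuristic values, and the use of a plain minimax backup for the "deep" chooser are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical representation of one node in an extracted search tree.
    move: str      # illustrative move label
    value: float   # assumed heuristic value, from the root player's perspective
    children: list["Node"] = field(default_factory=list)


def depth(node: Node) -> int:
    """Length (in moves) of the longest line explored below this node."""
    if not node.children:
        return 0
    return 1 + max(depth(child) for child in node.children)


def breadth(root: Node) -> int:
    """Number of distinct candidate first moves considered at the root."""
    return len(root.children)


def backed_up_value(node: Node, maximizing: bool) -> float:
    """Minimax backup: leaf values propagate up through the explored tree."""
    if not node.children:
        return node.value
    values = [backed_up_value(child, not maximizing) for child in node.children]
    return max(values) if maximizing else min(values)


def myopic_choice(root: Node) -> str:
    """Pick the first move by its own value, ignoring all deeper nodes."""
    return max(root.children, key=lambda child: child.value).move


def deep_choice(root: Node) -> str:
    """Pick the first move by its backed-up value (opponent moves next)."""
    return max(
        root.children,
        key=lambda child: backed_up_value(child, maximizing=False),
    ).move


# Toy tree (made-up values): move A looks best at depth 1,
# but the opponent's reply A1 refutes it; move B pays off only deeper.
root = Node("root", 0.0, [
    Node("A", 0.6, [Node("A1", -0.9)]),
    Node("B", 0.4, [Node("B1", 0.5, [Node("B1a", 0.8)])]),
    Node("C", 0.1),
])

print("depth:", depth(root), "breadth:", breadth(root))
print("myopic choice:", myopic_choice(root))
print("deep choice:", deep_choice(root))
```

In this toy tree the two choosers disagree: the myopic chooser takes the move with the best immediate value, while the backed-up chooser prefers the move whose deeper continuation is better. The abstract's finding is that LLM move choices behave more like the first kind, even when deeper nodes appear in the trace.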

