CL · AI · Nov 1, 2025

Test-time Scaling of LLMs: A Survey from A Subproblem Structure Perspective

arXiv:2511.14772v1 · 2 citations · h-index: 2
Originality: Synthesis-oriented
AI Analysis

This is an incremental survey that organizes existing test-time scaling methods for LLMs, aiding researchers in understanding and advancing inference optimization.

The paper surveys techniques for improving predictive accuracy of pretrained large language models by allocating additional compute at inference time, categorizing methods based on subproblem decomposition and topological organization to unify approaches like Chain-of-Thought and Tree-of-Thought.

In this paper, we survey techniques for improving the predictive accuracy of pretrained large language models by allocating additional compute at inference time. In categorizing test-time scaling methods, we place special emphasis on how a problem is decomposed into subproblems and on the topological organization of these subproblems, whether sequential, parallel, or tree-structured. This perspective allows us to unify diverse approaches such as Chain-of-Thought, Branch-Solve-Merge, and Tree-of-Thought under a common lens. We further synthesize existing analyses of these techniques, highlighting their respective strengths and weaknesses, and conclude by outlining promising directions for future research.
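The three subproblem topologies named in the abstract can be illustrated with a minimal Python sketch. This is not the survey's own formalization: `toy_solver` is a hypothetical stand-in for an LLM call, the function names and prompt formats are assumptions made for illustration, and the tree variant uses a simple greedy selection rather than any specific published search procedure.

```python
from typing import Callable, List

Solver = Callable[[str], str]


def toy_solver(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; it only wraps the prompt."""
    return f"answer({prompt})"


def sequential(problem: str, steps: List[str], solve: Solver) -> str:
    """Chain-like topology: each subproblem consumes the previous result."""
    context = problem
    for step in steps:
        context = solve(f"{step} | {context}")
    return context


def parallel(problem: str, parts: List[str], solve: Solver) -> str:
    """Branch-then-merge topology: independent subproblems, one merge call."""
    partials = [solve(f"{part} | {problem}") for part in parts]
    return solve("merge | " + " ; ".join(partials))


def tree(problem: str, depth: int, width: int,
         solve: Solver, score: Callable[[str], float]) -> str:
    """Tree-like topology (greedy sketch): expand `width` candidates per
    level and keep only the best-scoring one before expanding again."""
    state = problem
    for _ in range(depth):
        candidates = [solve(f"option {i} | {state}") for i in range(width)]
        state = max(candidates, key=score)
    return state
```

Under these assumptions, the extra inference-time compute shows up as the number of solver calls: one per step for `sequential`, one per part plus one merge for `parallel`, and `depth * width` for `tree`.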

