TSLM: Tree-Structured Language Modeling for Divergent Thinking
This paper addresses inefficient exploration in language model reasoning and points toward a new paradigm of inference-time scaling.
The paper tackles the problem that language models generate reasoning sequentially, which prevents them from decoupling irrelevant exploration paths during search. It introduces Tree-Structured Language Modeling (TSLM), which uses special tokens to encode branching structure so that a model can generate and selectively expand multiple search paths in a single generation process. The reported result is robust performance and superior inference efficiency, because TSLM avoids the multiple independent forward passes required by external search methods.
Language models generate reasoning sequentially, preventing them from decoupling irrelevant exploration paths during search. We introduce Tree-Structured Language Modeling (TSLM), which uses special tokens to encode branching structure, enabling models to generate and selectively expand multiple search paths within a single generation process. By training on complete search trees including both successful and failed attempts, TSLM learns to internalize systematic exploration without redundant recomputation of shared prefixes. TSLM achieves robust performance and superior inference efficiency by avoiding the multiple independent forward passes required by external search methods. These results suggest a new paradigm of inference-time scaling for robust reasoning, demonstrating that supervised learning on complete tree-structured traces provides an efficient alternative for developing systematic exploration capabilities in language models.
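To make the idea of encoding branching structure with special tokens concrete, here is a minimal sketch of how a small search tree might be flattened into one token sequence. It is an illustration under stated assumptions, not the paper's actual format: the marker tokens `<branch>` and `</branch>`, the `Node` class, and the toy tree are all hypothetical. The token counts at the end are only a rough proxy for the shared-prefix argument, not a measurement of inference cost.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical marker tokens; the paper's actual special tokens may differ.
OPEN, CLOSE = "<branch>", "</branch>"


@dataclass
class Node:
    """One reasoning step (a list of tokens) plus its alternative continuations."""
    tokens: List[str]
    children: List["Node"] = field(default_factory=list)


def serialize(node: Node) -> List[str]:
    """Flatten a tree depth-first, wrapping every child subtree in markers."""
    out = list(node.tokens)
    for child in node.children:
        out.append(OPEN)
        out.extend(serialize(child))
        out.append(CLOSE)
    return out


def root_to_leaf_paths(node: Node, prefix=()) -> List[List[str]]:
    """Enumerate each path as a standalone sequence, repeating shared prefixes."""
    path = list(prefix) + node.tokens
    if not node.children:
        return [path]
    paths = []
    for child in node.children:
        paths.extend(root_to_leaf_paths(child, path))
    return paths


# Toy search tree containing both failed and successful attempts.
tree = Node(["goal", ":", "x", "=", "3"], [
    Node(["try", "a", "FAIL"]),
    Node(["try", "b"], [Node(["OK"]), Node(["FAIL"])]),
])

flat = serialize(tree)
paths = root_to_leaf_paths(tree)

# Content tokens appear once in the tree encoding...
content_tokens = sum(1 for t in flat if t not in (OPEN, CLOSE))
# ...but shared prefixes are repeated when each path is written out separately.
independent_tokens = sum(len(p) for p in paths)

print(" ".join(flat))
print(len(flat), content_tokens, independent_tokens)
```

In this toy case the tree encoding stores the shared prefix once, whereas writing out the three root-to-leaf paths independently repeats it for every path. That repetition is the kind of redundancy the abstract attributes to external search methods.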