CL | AI | Dec 14, 2025

State over Tokens: Characterizing the Role of Reasoning Tokens

arXiv:2512.12777v1 | 4 citations
Originality: Incremental advance
AI Analysis

The paper addresses a gap in researchers' understanding of LLM reasoning processes. The advance is incremental: it builds on existing observations and reports no new empirical results.

The paper tackles the problem that reasoning tokens in LLMs are not faithful explanations of the model's reasoning process, and introduces the State over Tokens (SoT) framework to reframe them as externalized computational state, explaining their role in driving correct reasoning.

Large Language Models (LLMs) can generate reasoning tokens before their final answer to boost performance on complex tasks. While these sequences seem like human thought processes, empirical evidence reveals that they are not a faithful explanation of the model's actual reasoning process. To address this gap between appearance and function, we introduce the State over Tokens (SoT) conceptual framework. SoT reframes reasoning tokens not as a linguistic narrative, but as an externalized computational state -- the sole persistent information carrier across the model's stateless generation cycles. This explains how the tokens can drive correct reasoning without being a faithful explanation when read as text, and it surfaces previously overlooked research questions about these tokens. We argue that to truly understand the process that LLMs carry out, research must move beyond reading the reasoning tokens as text and focus on decoding them as state.
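The "externalized state" idea in the abstract can be illustrated with a toy sketch. This is not code from the paper: the task (a running sum) and the names `step` and `generate` are hypothetical, chosen only to show a stateless function whose sole memory between cycles is the sequence of tokens it has already emitted.

```python
from typing import List


def step(prompt: List[int], scratch: List[int]) -> int:
    """One stateless generation cycle (hypothetical toy model).

    The function is pure: it keeps no memory between calls. Everything it
    knows about earlier progress must be read back from `scratch`, the
    tokens emitted so far.
    """
    consumed = len(scratch)                  # number of inputs already folded in
    running = scratch[-1] if scratch else 0  # last token encodes the running total
    return running + prompt[consumed]


def generate(prompt: List[int]) -> List[int]:
    """Append-only decoding loop: call `step` until every input is consumed."""
    scratch: List[int] = []
    while len(scratch) < len(prompt):
        scratch.append(step(prompt, scratch))
    return scratch


trace = generate([3, 5, 2, 7])
print(trace)  # [3, 8, 10, 17]; the last token is the final answer

# Any fresh call can resume from the emitted tokens alone:
print(step([3, 5, 2, 7], [3, 8]))  # 10
```

In this sketch the intermediate tokens are useful because of what they let the next cycle compute, not because they read as an explanation. That is the distinction the SoT framing draws, here under the simplifying assumption that each token is a single integer.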

