CL | AI | Feb 9

Latent Reasoning with Supervised Thinking States

arXiv:2602.08332v1 | 3 citations | h-index: 37
Originality: Incremental advance
AI Analysis

This work addresses the inference cost that users of large language models face in reasoning-heavy applications, though it is an incremental improvement over existing latent reasoning methods.

The paper tackles the high inference cost of chain-of-thought (CoT) reasoning in LLMs by proposing Thinking States, a method that generates thinking tokens while the input is being processed. The method improves latency, narrows the gap to CoT on math problems, matches CoT on 2-Hop QA, and extrapolates to longer sequences than CoT on state-tracking tasks.

Reasoning with a chain-of-thought (CoT) enables Large Language Models (LLMs) to solve complex tasks but incurs significant inference costs due to the generation of long rationales. We propose Thinking States, a method that performs reasoning while the input is processing. Specifically, Thinking States generates sequences of thinking tokens every few input tokens, transforms the thoughts back into embedding space, and adds them to the following input tokens. This has two key advantages. First, it captures the recurrent nature of CoT, but where the thought tokens are generated as input is processing. Second, since the thoughts are represented as tokens, they can be learned from natural language supervision, and using teacher-forcing, which is parallelizable. Empirically, Thinking States outperforms other latent reasoning methods on multiple reasoning tasks, narrowing the gap to CoT on math problems, and matching its performance on 2-Hop QA with improved latency. On state-tracking tasks, we show Thinking States leads to stronger reasoning behavior than CoT, successfully extrapolating to longer sequences than seen during training.
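
The loop described in the abstract (emit thinking tokens every few input tokens, map them back into embedding space, add them to the following input tokens) can be illustrated with a toy sketch. This is not the authors' implementation: the function names `thinking_states_pass`, `generate_thoughts`, and `embed_thought` are hypothetical, the chunking scheme is a guess at what "every few input tokens" means, and summing thought embeddings into a single state vector is an assumed pooling choice made only to keep the example small.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def add_vectors(a: Vector, b: Vector) -> Vector:
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]


def thinking_states_pass(
    input_embeddings: Sequence[Vector],
    chunk_size: int,
    generate_thoughts: Callable[[List[Vector]], List[int]],
    embed_thought: Callable[[int], Vector],
) -> Tuple[List[Vector], List[List[int]]]:
    """Toy recurrence: thoughts from one chunk condition the next chunk.

    generate_thoughts and embed_thought are hypothetical stand-ins for the
    model's thought decoder and its thought-to-embedding map.
    """
    dim = len(input_embeddings[0])
    state = [0.0] * dim          # no thoughts exist before the first chunk
    processed: List[Vector] = []
    thoughts_per_chunk: List[List[int]] = []

    for start in range(0, len(input_embeddings), chunk_size):
        chunk = input_embeddings[start:start + chunk_size]
        # Add the previous chunk's thought state to each input embedding.
        processed.extend(add_vectors(e, state) for e in chunk)

        # Generate thinking tokens from everything processed so far.
        thoughts = generate_thoughts(processed)
        thoughts_per_chunk.append(thoughts)

        # Map thoughts back to embedding space (assumed: sum of embeddings).
        state = [0.0] * dim
        for token in thoughts:
            state = add_vectors(state, embed_thought(token))

    return processed, thoughts_per_chunk


# Toy usage: the "thought" is just the number of tokens seen so far.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
processed, thoughts = thinking_states_pass(
    embeddings,
    chunk_size=2,
    generate_thoughts=lambda seen: [len(seen)],
    embed_thought=lambda t: [float(t), 0.0],
)
# thoughts  == [[2], [4]]
# processed == [[1.0, 0.0], [0.0, 1.0], [3.0, 1.0], [4.0, 0.0]]
```

In the toy run, the second chunk's embeddings are shifted by the embedding of the first chunk's thought, which is the recurrence the abstract attributes to CoT. Because the thoughts are discrete tokens, a training setup could supply target thoughts for each chunk (teacher forcing) instead of generating them sequentially, which is what makes the parallel training mentioned in the abstract possible.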

