CL · LG · ML · Feb 6

Inference-Time Rethinking with Latent Thought Vectors for Math Reasoning

arXiv:2602.06584v1 · 2 citations · h-index: 6
Originality: Highly original
AI Analysis

This work addresses error recovery in mathematical reasoning for AI systems, offering a novel approach that refines the reasoning strategy iteratively at inference time.

The paper tackles early errors in chain-of-thought math reasoning by introducing Inference-Time Rethinking, a framework that uses latent thought vectors for iterative self-correction. A 0.2B-parameter model trained from scratch on GSM8K with this method surpasses larger baselines, including a 3B counterpart.

Standard chain-of-thought reasoning generates a solution in a single forward pass, committing irrevocably to each token and lacking a mechanism to recover from early errors. We introduce Inference-Time Rethinking, a generative framework that enables iterative self-correction by decoupling declarative latent thought vectors from procedural generation. We factorize reasoning into a continuous latent thought vector (what to reason about) and a decoder that verbalizes the trace conditioned on this vector (how to reason). Beyond serving as a declarative buffer, latent thought vectors compress the reasoning structure into a continuous representation that abstracts away surface-level token variability, making gradient-based optimization over reasoning strategies well-posed. Our prior model maps unstructured noise to a learned manifold of valid reasoning patterns, and at test time we employ a Gibbs-style procedure that alternates between generating a candidate trace and optimizing the latent vector to better explain that trace, effectively navigating the latent manifold to refine the reasoning strategy. Training a 0.2B-parameter model from scratch on GSM8K, our method with 30 rethinking iterations surpasses baselines with 10 to 15 times more parameters, including a 3B counterpart. This result demonstrates that effective mathematical reasoning can emerge from sophisticated inference-time computation rather than solely from massive parameter counts.
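The alternation described in the abstract (generate a candidate trace from the latent vector, then optimize the latent vector to better explain that trace) can be illustrated with a toy sketch. This is not the paper's model: the linear-Gaussian "decoder" `W`, the standard normal prior on `z`, the noise level, and all function names below are assumptions chosen only to make the loop concrete and runnable.

```python
import numpy as np

NOISE = 0.5  # assumed observation noise of the toy decoder


def generate_trace(W, z, rng):
    """Toy stand-in for the decoder: sample a 'trace' x ~ N(W z, NOISE^2 I)."""
    return W @ z + NOISE * rng.standard_normal(W.shape[0])


def neg_log_joint(W, z, x):
    """-log p(x | z) - log p(z), up to constants, with z ~ N(0, I)."""
    resid = x - W @ z
    return 0.5 * (resid @ resid) / NOISE**2 + 0.5 * (z @ z)


def refine_latent(W, z, x, steps=50):
    """Gradient descent on z so that it better explains the fixed trace x."""
    # Step size 1/L, where L is the largest eigenvalue of the (constant)
    # Hessian W^T W / NOISE^2 + I; this keeps every step a descent step.
    lipschitz = np.linalg.norm(W, 2) ** 2 / NOISE**2 + 1.0
    lr = 1.0 / lipschitz
    for _ in range(steps):
        grad = -W.T @ (x - W @ z) / NOISE**2 + z
        z = z - lr * grad
    return z


def rethink(W, z0, iters=30, seed=0):
    """Alternate between sampling a trace given z and updating z given the trace."""
    rng = np.random.default_rng(seed)
    z = z0.copy()
    for _ in range(iters):
        x = generate_trace(W, z, rng)  # "how to reason": verbalize a trace
        z = refine_latent(W, z, x)     # "what to reason about": update the latent
    return z
```

A call such as `rethink(np.random.default_rng(1).standard_normal((8, 4)), np.zeros(4))` runs 30 alternation rounds, matching the iteration count quoted in the abstract. In the paper the decoder is a trained language model and the prior is learned; the quadratic objective here only mirrors the structure of the procedure.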

