AI | Feb 4

Fluid Representations in Reasoning Models

Dmitrii Kharlapenko, Alessandro Stolfo, Arthur Conmy, Mrinmaya Sachan, Zhijing Jin

arXiv:2602.04843v1 | 2.4 | h-index: 40

Originality: Synthesis-oriented

AI Analysis

The paper gives AI researchers mechanistic insight into reasoning models, but it is incremental: it builds on existing work without introducing a new method.

The paper investigates how reasoning language models achieve superior performance by analyzing QwQ-32B on the Mystery Blocksworld domain. It finds that the model gradually refines its internal representations of actions and concepts during reasoning. Steering experiments show that injecting refined representations boosts accuracy and that symbolic representations can replace many obfuscated encodings with minimal performance loss.

Reasoning language models, which generate long chains of thought, dramatically outperform non-reasoning language models on abstract problems. However, the internal model mechanisms that allow this superior performance remain poorly understood. We present a mechanistic analysis of how QwQ-32B - a model specifically trained to produce extensive reasoning traces - processes abstract structural information. On Mystery Blocksworld - a semantically obfuscated planning domain - we find that QwQ-32B gradually improves its internal representation of actions and concepts during reasoning. The model develops abstract encodings that focus on structure rather than specific action names. Through steering experiments, we establish causal evidence that these adaptations improve problem solving: injecting refined representations from successful traces boosts accuracy, while symbolic representations can replace many obfuscated encodings with minimal performance loss. We find that one of the factors driving reasoning model performance is in-context refinement of token representations, which we dub Fluid Reasoning Representations.
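
The steering idea in the abstract (injecting a refined representation into the model's activations) can be illustrated with a toy sketch. This is not the authors' procedure: the arrays are random stand-ins for one layer's hidden states, and `mean_representation`, `steer`, `action_positions`, and `alpha` are hypothetical names chosen for illustration. It only shows the general pattern of averaging a token's representation from one trace and blending it into another.

```python
import numpy as np

def mean_representation(hidden_states, positions):
    """Average the hidden-state rows at the given token positions."""
    return hidden_states[positions].mean(axis=0)

def steer(hidden_states, positions, target, alpha=1.0):
    """Move the hidden states at `positions` toward `target`.

    alpha=0 leaves the activations unchanged; alpha=1 replaces them with `target`.
    Returns a new array; the input is not modified.
    """
    steered = hidden_states.copy()
    steered[positions] = (1.0 - alpha) * steered[positions] + alpha * target
    return steered

rng = np.random.default_rng(0)
d_model = 8  # toy hidden size (assumption)

# Random stand-ins for one layer's activations over two 10-token traces.
early_trace = rng.normal(size=(10, d_model))
late_trace = rng.normal(size=(10, d_model))

# Hypothetical positions where the same obfuscated action name occurs.
action_positions = [2, 5, 7]

# "Refined" representation taken from the later trace, injected into the earlier one.
refined = mean_representation(late_trace, action_positions)
patched = steer(early_trace, action_positions, refined, alpha=1.0)
```

In a real model the same blend would typically be applied inside a forward pass at a chosen layer, and the effect would be measured as a change in task accuracy; the choice of layer, positions, and `alpha` here is an assumption of the sketch.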
