Mitigating Cognitive Inertia in Large Reasoning Models via Latent Spike Steering
This work addresses a specific failure mode in reasoning models, cognitive inertia, and offers an incremental improvement for AI systems that depend on multi-step reasoning.
The paper tackles cognitive inertia in Large Reasoning Models, where a model either overthinks or becomes rigid in its line of reasoning. It proposes STARS, a training-free framework that monitors latent dynamics to detect and correct these failures, and reports improved accuracy and fewer redundant loops across diverse benchmarks.
While Large Reasoning Models (LRMs) have achieved remarkable performance by scaling test-time compute, they frequently suffer from Cognitive Inertia, a failure pattern that manifests as either overthinking (inertia of motion) or reasoning rigidity (inertia of direction). Existing detection methods typically rely on superficial textual heuristics such as self-correction tokens and often fail to capture the model's unvoiced internal conflicts. To address this, we propose STARS (Spike-Triggered Adaptive Reasoning Steering), a training-free framework designed to rectify cognitive inertia by monitoring latent dynamics. STARS identifies Cognitive Pivots, critical moments of reasoning transition, by detecting distinct L2 distance spikes in the hidden states. Upon detection, the framework employs geometric trajectory analysis to diagnose the structural nature of the transition and injects state-aware language cues to steer the model in real time. Our experiments across diverse benchmarks confirm that STARS efficiently curtails redundant loops while improving accuracy through the adaptive correction of erroneous trajectories. STARS offers a robust, unsupervised mechanism to optimize the reasoning process of LRMs without requiring additional fine-tuning.
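To make the spike-detection idea concrete, the following is a minimal sketch, not the STARS implementation. It computes the L2 distance between consecutive hidden-state vectors and flags steps where that distance jumps well above its recent level. The abstract does not specify a thresholding rule, so the rolling mean-plus-k-standard-deviations test, the function names `l2_step_distances` and `detect_spikes`, and the parameters `window` and `k` are all assumptions made for illustration, as is the toy 2-D trajectory standing in for real hidden states.

```python
import math
from statistics import mean, pstdev


def l2_step_distances(hidden_states):
    """L2 distance between each pair of consecutive hidden-state vectors.

    Entry i is the distance between state i and state i + 1.
    """
    return [math.dist(prev, curr)
            for prev, curr in zip(hidden_states, hidden_states[1:])]


def detect_spikes(distances, window=4, k=3.0):
    """Return indices whose distance exceeds a rolling threshold.

    ASSUMPTION: the rule (mean + k * std of the preceding `window`
    distances) is illustrative and is not taken from the paper.
    """
    spikes = []
    for i in range(window, len(distances)):
        recent = distances[i - window:i]
        threshold = mean(recent) + k * pstdev(recent)
        if distances[i] > threshold:
            spikes.append(i)
    return spikes


# Toy 2-D trajectory: small, even steps followed by one abrupt jump.
states = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (1.5, 0.0),
          (2.0, 0.0), (5.0, 4.0), (5.5, 4.0)]
distances = l2_step_distances(states)
spikes = detect_spikes(distances)
print(spikes)  # [4]: the jump between states 4 and 5
```

In a real setting the vectors would be per-token hidden states read from a chosen layer of the model, and a flagged index would be a candidate Cognitive Pivot handed to the later diagnosis and steering stages. A purely relative threshold like this one is sensitive to near-constant distances, so a practical variant would likely add a minimum absolute gap.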