Mar 29, 2025

The Reasoning-Memorization Interplay in Language Models Is Mediated by a Single Direction

arXiv:2503.23084v1, 21 citations, h-index: 5, ACL
Originality: Incremental advance
AI Analysis

For AI researchers, this work addresses LLMs' over-reliance on memorization and offers insights for building more robust and interpretable systems. It is an incremental advance that builds on prior studies of reasoning and memorization dynamics.

The study examines when large language models switch between reasoning and memorization during text generation. It identifies linear features in the model's residual stream that control this balance and shows that manipulating these features influences model performance on reasoning tasks.

Large language models (LLMs) excel on a variety of reasoning benchmarks, but previous studies suggest they sometimes struggle to generalize to unseen questions, potentially due to over-reliance on memorized training examples. However, the precise conditions under which LLMs switch between reasoning and memorization during text generation remain unclear. In this work, we provide a mechanistic understanding of LLMs' reasoning-memorization dynamics by identifying a set of linear features in the model's residual stream that govern the balance between genuine reasoning and memory recall. These features not only distinguish reasoning tasks from memory-intensive ones but can also be manipulated to causally influence model performance on reasoning tasks. Additionally, we show that intervening in these reasoning features helps the model more accurately activate the most relevant problem-solving capabilities during answer generation. Our findings offer new insights into the underlying mechanisms of reasoning and memory in LLMs and pave the way for the development of more robust and interpretable generative AI systems.
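To make the idea of a linear feature that can be read out and manipulated more concrete, here is a minimal sketch on toy vectors. It assumes a difference-of-means estimate of the direction and additive steering along it. Both are common choices in interpretability work, but the abstract does not specify the authors' exact procedure, so treat this as an illustration and not as their method. The function names, the toy activations, and the `alpha` strength parameter are all hypothetical.

```python
import math

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def unit_direction(reasoning_acts, memory_acts):
    """Assumed estimator: normalised difference of class means."""
    mu_r = mean_vector(reasoning_acts)
    mu_m = mean_vector(memory_acts)
    diff = [a - b for a, b in zip(mu_r, mu_m)]
    norm = math.sqrt(sum(d * d for d in diff))
    return [d / norm for d in diff]

def project(h, direction):
    """Scalar readout: how far h lies along the direction."""
    return sum(x * d for x, d in zip(h, direction))

def steer(h, direction, alpha):
    """Assumed intervention: shift h by alpha along the direction."""
    return [x + alpha * d for x, d in zip(h, direction)]

# Toy stand-ins for residual-stream activations (not real model data).
reasoning_acts = [[2.0, 0.0, 1.0], [4.0, 0.0, 1.0]]
memory_acts = [[0.0, 0.0, 1.0], [-2.0, 0.0, 1.0]]

direction = unit_direction(reasoning_acts, memory_acts)
h = [0.5, 3.0, 1.0]
steered = steer(h, direction, alpha=2.0)

print("direction:", direction)
print("readout before:", project(h, direction))
print("readout after:", project(steered, direction))
```

In this sketch the readout along the direction separates the two toy groups, and steering changes only the component of the activation along that direction, leaving the orthogonal components untouched.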

