CL, May 12

Geometric Factual Recall in Transformers

Shauli Ravfogel, Gilad Yehudai, Joan Bruna, Alberto Bietti

arXiv:2605.12426

AI Analysis

For researchers in mechanistic interpretability and language model scaling, this work provides a theoretical foundation for a more efficient form of memorization, challenging the prevailing associative memory view.

The paper shows that transformers can memorize factual associations using logarithmic embedding dimension via geometric superposition of attribute vectors, with the MLP acting as a generic selector rather than a key-value memory. Empirical results confirm that gradient descent discovers this structure and enables zero-shot transfer to new facts.

How do transformer language models memorize factual associations? A common view casts internal weight matrices as associative memories over pairs of embeddings, requiring parameter counts that scale linearly with the number of facts. We develop a theoretical and empirical account of an alternative, \emph{geometric} form of memorization in which learned embeddings encode relational structure directly, and the MLP plays a qualitatively different role. In a controlled setting where a single-layer transformer must memorize random bijections from subjects to a shared attribute set, we prove that a logarithmic embedding dimension suffices: subject embeddings encode \emph{linear superpositions} of their associated attribute vectors, and a small MLP acts as a relation-conditioned selector that extracts the relevant attribute via ReLU gating, and not as an associative key-value mapping. We extend these results to the multi-hop setting -- chains of relational queries such as ``Who is the mother of the wife of $x$?'' -- providing constructions with and without chain-of-thought that exhibit a provable capacity-depth tradeoff, complemented by a matching information-theoretic lower bound. Empirically, gradient descent discovers solutions with precisely the predicted structure. Once trained, the MLP transfers zero-shot to entirely new bijections when subject embeddings are appropriately re-initialized, revealing that it has learned a generic selection mechanism rather than memorized any particular set of facts.
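The selector idea in the abstract can be illustrated with a small NumPy toy. This is a sketch under stated assumptions, not the paper's construction: the binary sign codes, the block-wise layout of each subject embedding, the `margin` parameter, and all function names are hypothetical choices made for illustration. It shows subject embeddings built as sums of attribute codes (one block per relation), a ReLU-gated selector whose weights do not depend on the facts, and the same selector reused on a fresh set of bijections.

```python
import numpy as np


def make_codes(num_attrs):
    """One unit-norm +-1 code per attribute, ceil(log2(num_attrs)) dims (assumed coding)."""
    dim = max(1, int(np.ceil(np.log2(num_attrs))))
    bits = (np.arange(num_attrs)[:, None] >> np.arange(dim)[None, :]) & 1
    return (2.0 * bits - 1.0) / np.sqrt(dim)


def embed_subjects(bijections, codes):
    """Row s stacks, for each relation r, the code of attribute bijections[r, s]."""
    num_subj = bijections.shape[1]
    # codes[bijections] has shape (R, N, d); reorder to (N, R, d), then flatten blocks.
    return codes[bijections].transpose(1, 0, 2).reshape(num_subj, -1)


def relu(x):
    return np.maximum(x, 0.0)


def select(embeddings, relation, num_rel, margin=2.0):
    """Two-layer ReLU selector: passes the block of `relation`, zeroes all others.

    The weights depend only on the relation gate, never on which facts are stored.
    Requires every embedding coordinate to satisfy |x| <= margin.
    """
    blocks = embeddings.reshape(embeddings.shape[0], num_rel, -1)
    gate = np.zeros(num_rel)
    gate[relation] = 1.0
    shift = margin * (1.0 - gate)[None, :, None]
    # For the gated block: relu(x) - relu(-x) = x. For the others: both terms are 0.
    return (relu(blocks - shift) - relu(-blocks - shift)).sum(axis=1)


def recall_accuracy(bijections, codes):
    """Fraction of (subject, relation) queries decoded to the correct attribute."""
    num_rel = bijections.shape[0]
    emb = embed_subjects(bijections, codes)
    correct = 0
    for r in range(num_rel):
        logits = select(emb, r, num_rel) @ codes.T
        correct += np.sum(np.argmax(logits, axis=1) == bijections[r])
    return correct / bijections.size


rng = np.random.default_rng(0)
N, R = 64, 3                      # subjects (= attributes) and relations
codes = make_codes(N)             # 6 dimensions per block for N = 64
facts_a = np.stack([rng.permutation(N) for _ in range(R)])
facts_b = np.stack([rng.permutation(N) for _ in range(R)])  # entirely new bijections

acc_a = recall_accuracy(facts_a, codes)
acc_b = recall_accuracy(facts_b, codes)  # same selector, re-built embeddings
print(embed_subjects(facts_a, codes).shape, acc_a, acc_b)
```

In this toy, each subject embedding has R * ceil(log2 N) coordinates, so its width grows logarithmically in the number of subjects for a fixed number of relations. All fact-specific information lives in the embeddings, and `select` is unchanged between `facts_a` and `facts_b`. The trained models and proofs described in the abstract may use a different, denser superposition.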
