AI · Nov 17, 2025

Cognitive Maps in Language Models: A Mechanistic Analysis of Spatial Planning

Caroline Baumgartner, Eleanor Spens, Neil Burgess, Petru Manescu

arXiv:2511.13371v1 · 3.3 · h-index: 10

Originality: Incremental advance

AI Analysis

This work offers AI researchers insight into spatial intelligence in transformers, highlighting a trade-off between generalization and task-specific optimization.

The study investigated how language models solve spatial navigation tasks by training GPT-2 under different paradigms. Depending on the training regime, the models develop either map-like representations or path-dependent algorithms. A hybrid model generalizes better than its parent but remains path-dependent.

How do large language models solve spatial navigation tasks? We investigate this by training GPT-2 models on three spatial learning paradigms in grid environments: passive exploration (Foraging model, predicting steps in random walks), goal-directed planning (generating optimal shortest paths) on structured Hamiltonian paths (SP-Hamiltonian), and a hybrid model fine-tuned with exploratory data (SP-Random Walk). Using behavioural, representational and mechanistic analyses, we uncover two fundamentally different learned algorithms. The Foraging model develops a robust, map-like representation of space, akin to a 'cognitive map'. Causal interventions reveal that it learns to consolidate spatial information into a self-sufficient coordinate system, evidenced by a sharp phase transition where its reliance on historical direction tokens vanishes by the middle layers of the network. The model also adopts an adaptive, hierarchical reasoning system, switching between a low-level heuristic for short contexts and map-based inference for longer ones. In contrast, the goal-directed models learn a path-dependent algorithm, remaining reliant on explicit directional inputs throughout all layers. The hybrid model, despite demonstrating improved generalisation over its parent, retains the same path-dependent strategy. These findings suggest that the nature of spatial intelligence in transformers may lie on a spectrum, ranging from generalisable world models shaped by exploratory data to heuristics optimised for goal-directed tasks. We provide a mechanistic account of this generalisation-optimisation trade-off and highlight how the choice of training regime influences the strategies that emerge.
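
To make the two kinds of training data concrete, here is a minimal sketch of how random-walk sequences (the exploratory regime) and optimal shortest-path sequences (the goal-directed regime) could be generated on a square grid. It is an illustration under stated assumptions, not the authors' pipeline: the direction vocabulary `N/S/E/W`, the function names, the grid size, and the use of breadth-first search are all hypothetical, and the Hamiltonian-path structure mentioned in the abstract is not modelled here.

```python
import random
from collections import deque

# Hypothetical direction vocabulary; the paper's actual tokenisation is not specified here.
MOVES = {"N": (0, -1), "S": (0, 1), "E": (1, 0), "W": (-1, 0)}


def in_grid(x, y, size):
    """True if (x, y) lies inside a size x size grid."""
    return 0 <= x < size and 0 <= y < size


def random_walk(size, steps, rng):
    """Exploratory data: a random walk that never leaves the grid (assumes size >= 2)."""
    x, y = rng.randrange(size), rng.randrange(size)
    start = (x, y)
    directions = []
    for _ in range(steps):
        legal = [d for d, (dx, dy) in MOVES.items() if in_grid(x + dx, y + dy, size)]
        d = rng.choice(legal)
        dx, dy = MOVES[d]
        x, y = x + dx, y + dy
        directions.append(d)
    return start, directions


def shortest_path(size, start, goal):
    """Goal-directed data: one optimal direction sequence, found by breadth-first search."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            break
        for d, (dx, dy) in MOVES.items():
            nxt = (cell[0] + dx, cell[1] + dy)
            if in_grid(nxt[0], nxt[1], size) and nxt not in parent:
                parent[nxt] = (cell, d)
                queue.append(nxt)
    # Walk back from the goal to the start, collecting the moves taken.
    path = []
    cell = goal
    while parent[cell] is not None:
        cell, d = parent[cell]
        path.append(d)
    return path[::-1]


def follow(start, directions):
    """Apply a direction sequence to a start cell and return the final cell."""
    x, y = start
    for d in directions:
        dx, dy = MOVES[d]
        x, y = x + dx, y + dy
    return (x, y)
```

In this sketch, a sequence from `random_walk` contains only local step information, whereas a sequence from `shortest_path` is conditioned on a goal; that difference in what the data exposes is the contrast the abstract draws between the exploratory and goal-directed regimes.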

