CL · Oct 21, 2025

Topoformer: brain-like topographic organization in Transformer language models through spatial querying and reweighting

Taha Binhuraib, Greta Tuckute, Nicholas Blauch

arXiv:2510.18745v1 · 10.96 citations · h-index: 15

Originality: Incremental advance

AI Analysis

This work targets the interpretability of NLP models, offering researchers a way to give Transformers a brain-like spatial organization. The advance is incremental: it modifies existing architectures and matches, rather than improves on, their benchmark performance.

The authors tackled the lack of spatial organization in Transformer models by introducing Topoformer, a variant with topographic organization through spatial querying and reweighting, achieving performance on par with standard models on NLP benchmarks while enabling interpretable alignment with human brain data.

Spatial functional organization is a hallmark of biological brains: neurons are arranged topographically according to their response properties, at multiple scales. In contrast, representations within most machine learning models lack spatial biases, instead manifesting as disorganized vector spaces that are difficult to visualize and interpret. Here, we propose a novel form of self-attention that turns Transformers into "Topoformers" with topographic organization. We introduce spatial querying - where keys and queries are arranged on 2D grids, and local pools of queries are associated with a given key - and spatial reweighting, where we convert the standard fully connected layer of self-attention into a locally connected layer. We first demonstrate the feasibility of our approach by training a 1-layer Topoformer on a sentiment classification task. Training with spatial querying encourages topographic organization in the queries and keys, and spatial reweighting separately encourages topographic organization in the values and self-attention outputs. We then apply the Topoformer motifs at scale, training a BERT architecture with a masked language modeling objective. We find that the topographic variant performs on par with a non-topographic control model on NLP benchmarks, yet produces interpretable topographic organization as evaluated via eight linguistic test suites. Finally, analyzing an fMRI dataset of human brain responses to a large set of naturalistic sentences, we demonstrate alignment between low-dimensional topographic variability in the Topoformer model and human brain language network. Scaling up Topoformers further holds promise for greater interpretability in NLP research, and for more accurate models of the organization of linguistic information in the human brain.
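The two motifs named in the abstract can be illustrated with a minimal single-head sketch. The abstract gives no formulas, so the matrix forms here are assumptions made for illustration: spatial querying is written as pooling queries through a binary local-connectivity mask on a 2D grid, and spatial reweighting as an elementwise mask on the output projection. The names `local_mask` and `topo_attention`, the 4x4 grid, and the radius are all hypothetical, and this should not be read as the authors' implementation.

```python
import math
import random

def local_mask(side, radius):
    """Binary (side*side) x (side*side) matrix over a side x side grid.
    Entry [i][j] is 1.0 when units i and j lie within `radius` of each other."""
    n = side * side
    coords = [(i // side, i % side) for i in range(n)]
    return [[1.0 if math.dist(coords[i], coords[j]) <= radius else 0.0
             for j in range(n)] for i in range(n)]

def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(r) for r in zip(*a)]

def softmax_rows(a):
    out = []
    for row in a:
        m = max(row)
        e = [math.exp(x - m) for x in row]
        s = sum(e)
        out.append([x / s for x in e])
    return out

def topo_attention(q, k, v, w_out, mask):
    """Single-head attention with both motifs (assumed forms, see lead-in)."""
    d = len(q[0])
    # Spatial querying (assumed form): each query dimension is replaced by
    # the sum over its local pool of grid neighbours before matching keys.
    pooled_q = matmul(q, mask)
    scores = matmul(pooled_q, transpose(k))
    attn = softmax_rows([[s / math.sqrt(d) for s in row] for row in scores])
    # Spatial reweighting (assumed form): the dense output projection is
    # masked elementwise, which makes it a locally connected layer.
    local_w = [[w * m for w, m in zip(w_row, m_row)]
               for w_row, m_row in zip(w_out, mask)]
    return matmul(matmul(attn, v), local_w), attn

def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

# Toy usage: 3 tokens, 16 hidden units laid out on a 4x4 grid.
random.seed(0)
SIDE, TOKENS = 4, 3
D = SIDE * SIDE
mask = local_mask(SIDE, radius=1.0)
q, k, v = rand_matrix(TOKENS, D), rand_matrix(TOKENS, D), rand_matrix(TOKENS, D)
w_out = rand_matrix(D, D)
out, attn = topo_attention(q, k, v, w_out, mask)
print(len(out), len(out[0]))
```

Under these assumptions, each output unit depends only on value units in its grid neighbourhood, which is the kind of local constraint that could encourage nearby units to develop similar response properties.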
