CL | AI | Jan 7

Where meaning lives: Layer-wise accessibility of psycholinguistic features in encoder and decoder language models

arXiv:2601.03798v1 | 1 citation | h-index: 1
Originality: Incremental advance
AI Analysis

This work helps researchers in NLP and cognitive science interpret model representations, though its probing methodology is an incremental advance.

The study investigates where transformer language models encode psycholinguistic features. It finds that localization is method-dependent and that final layers are rarely optimal, but that models share a depth ordering of meaning dimensions.

Understanding where transformer language models encode psychologically meaningful aspects of meaning is essential for both theory and practice. We conduct a systematic layer-wise probing study of 58 psycholinguistic features across 10 transformer models, spanning encoder-only and decoder-only architectures, and compare three embedding extraction methods. We find that apparent localization of meaning is strongly method-dependent: contextualized embeddings yield higher feature-specific selectivity and different layer-wise profiles than isolated embeddings. Across models and methods, final-layer representations are rarely optimal for recovering psycholinguistic information with linear probes. Despite these differences, models exhibit a shared depth ordering of meaning dimensions, with lexical properties peaking earlier and experiential and affective dimensions peaking later. Together, these results show that where meaning "lives" in transformer models reflects an interaction between methodological choices and architectural constraints.
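
As a rough illustration of what layer-wise linear probing involves, the sketch below fits one linear probe per layer and compares cross-validated R² across layers. It is not the authors' implementation: the data are synthetic, and the ridge penalty, 5-fold split, embedding dimensions, and all function names (`probe_layers`, `make_synthetic_layers`, and so on) are assumptions made for illustration. In a real study, each entry of `layers` would hold the word embeddings extracted from one model layer and `y` would hold human ratings for one psycholinguistic feature.

```python
import numpy as np


def ridge_fit_predict(X_train, y_train, X_test, alpha=1.0):
    """Closed-form ridge regression; centres with training statistics only."""
    mu_x, mu_y = X_train.mean(axis=0), y_train.mean()
    Xc = X_train - mu_x
    A = Xc.T @ Xc + alpha * np.eye(Xc.shape[1])
    w = np.linalg.solve(A, Xc.T @ (y_train - mu_y))
    return (X_test - mu_x) @ w + mu_y


def cv_r2(X, y, alpha=1.0, n_folds=5, seed=0):
    """Cross-validated R^2 of a linear probe predicting y from X."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    preds = np.empty(len(y))
    for k, test in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != k])
        preds[test] = ridge_fit_predict(X[train], y[train], X[test], alpha)
    ss_res = np.sum((y - preds) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def probe_layers(layer_embeddings, y, **kwargs):
    """Return one probe score per layer."""
    return [cv_r2(X, y, **kwargs) for X in layer_embeddings]


def make_synthetic_layers(n_words=400, dim=16, n_layers=6, peak=3, seed=0):
    """Hypothetical data: the feature is most linearly readable at `peak`."""
    rng = np.random.default_rng(seed)
    y = rng.normal(size=n_words)              # stand-in for feature ratings
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)
    layers = []
    for layer in range(n_layers):
        strength = 1.0 / (1.0 + abs(layer - peak))  # signal decays away from peak
        noise = 0.5 * rng.normal(size=(n_words, dim))
        layers.append(strength * np.outer(y, direction) + noise)
    return layers, y


layers, y = make_synthetic_layers()
scores = probe_layers(layers, y)
best_layer = int(np.argmax(scores))
for layer, score in enumerate(scores):
    print(f"layer {layer}: cross-validated R^2 = {score:.3f}")
print("best layer:", best_layer)
```

In this toy setup the final layer is not the best-scoring one because the signal was planted at a middle layer; the example only shows the mechanics of comparing probes across depth and says nothing about any real model.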
