AI · CL · NC · Mar 17, 2025

Intra-neuronal attention within language models: Relationships between activation and semantics

arXiv:2503.12992v1 · h-index: 3
Originality: Synthesis-oriented
AI Analysis

This work addresses how language models form abstractions, but it is incremental as it builds on existing neuron analysis methods.

The study investigated whether neurons in language models can perform intra-neuronal attention by linking activation patterns to categorical segments, finding a weak relationship only for tokens with very high activation levels.

This study investigates the ability of perceptron-type neurons in language models to perform intra-neuronal attention; that is, to identify different homogeneous categorical segments within the synthetic thought category they encode, based on a segmentation of specific activation zones for the tokens to which they are particularly responsive. The objective of this work is therefore to determine to what extent formal neurons can establish a homomorphic relationship between activation-based and categorical segmentations. The results suggest the existence of such a relationship, albeit tenuous, only at the level of tokens with very high activation levels. This intra-neuronal attention subsequently enables categorical restructuring processes at the level of neurons in the following layer, thereby contributing to the progressive formation of high-level categorical abstractions.
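The comparison between activation-based and categorical segmentation described in the abstract can be illustrated with a minimal sketch. This is not the paper's procedure: the equal-size zone split, the purity score, and the toy tokens, activation values, and category labels are all assumptions chosen for illustration. The idea is only that if activation zones tracked categories, each zone would be dominated by a single category.

```python
from collections import Counter

def activation_zones(tokens, n_zones):
    """Rank (token, activation, category) triples by activation, descending,
    and split them into n_zones contiguous zones of near-equal size.
    The equal-size split is an illustrative assumption."""
    ranked = sorted(tokens, key=lambda t: t[1], reverse=True)
    size, rem = divmod(len(ranked), n_zones)
    zones, start = [], 0
    for i in range(n_zones):
        end = start + size + (1 if i < rem else 0)
        zones.append(ranked[start:end])
        start = end
    return zones

def zone_purity(zone):
    """Fraction of tokens in a zone that share its most common category.
    Purity is a stand-in measure here, not the paper's metric."""
    if not zone:
        return 0.0
    counts = Counter(category for _, _, category in zone)
    return counts.most_common(1)[0][1] / len(zone)

# Hypothetical toy data: tokens one neuron responds to, with made-up
# activation values and hand-assigned category labels.
tokens = [
    ("cat", 9.1, "animal"), ("dog", 8.7, "animal"), ("fox", 8.2, "animal"),
    ("oak", 5.0, "plant"), ("car", 4.6, "vehicle"), ("elm", 4.1, "plant"),
    ("bus", 2.0, "vehicle"), ("owl", 1.5, "animal"), ("ivy", 1.1, "plant"),
]

zones = activation_zones(tokens, 3)
purities = [zone_purity(z) for z in zones]

for rank, (zone, purity) in enumerate(zip(zones, purities), start=1):
    print(rank, [tok for tok, _, _ in zone], round(purity, 2))
```

In this toy data only the highest-activation zone is categorically homogeneous, which mirrors the shape of the reported finding without reproducing any of its numbers.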
