LG | DIS-NN | Sep 24, 2024

Self-attention as an attractor network: transient memories without backpropagation

arXiv:2409.16112v1 | 7 citations | h-index: 4
Originality: Incremental advance
AI Analysis

The paper offers a novel physics-inspired framework for understanding transformers. The topic is foundational for ML/AI, but the methodological advance is incremental.

The paper addresses the problem of interpreting self-attention in transformers. It shows that self-attention can be derived from local energy terms, which enables a recurrent model that is trained without backpropagation and whose dynamics exhibit transient states correlated with both training and test examples.

Transformers are among the most successful architectures in modern neural networks. At their core is the so-called attention mechanism, which has recently attracted interest from the physics community because, in certain cases, it can be written as the derivative of an energy function: while the cross-attention layer can be written as a modern Hopfield network, the same is not possible for self-attention, which is used in GPT architectures and other autoregressive models. In this work we show that the self-attention layer can be obtained as the derivative of local energy terms, which resemble a pseudo-likelihood. We leverage the analogy with pseudo-likelihood to design a recurrent model that can be trained without backpropagation: the dynamics show transient states that are strongly correlated with both training and test examples. Overall, we present a novel framework to interpret self-attention as an attractor network, potentially paving the way for new physics-inspired theoretical approaches to understanding transformers.
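As a rough illustration of the statement that attention can be the derivative of an energy function, the sketch below covers only the cross-attention case that the abstract says maps to a modern Hopfield network. It is not the paper's self-attention construction with local energy terms. The assumptions are ours: keys and values are the same stored patterns, there are no learned projection matrices, and the names `energy`, `attention_update`, and `numerical_grad` are hypothetical helpers. Under those assumptions, one attention step equals one unit-step gradient descent update on a log-sum-exp energy, which the code checks numerically.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D score vector.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def energy(query, patterns, beta=1.0):
    # Modern-Hopfield-style energy (assumed form, constants dropped):
    # E(q) = -(1/beta) * logsumexp(beta * X q) + 0.5 * |q|^2
    scores = beta * patterns @ query
    m = scores.max()
    lse = m + np.log(np.exp(scores - m).sum())
    return -lse / beta + 0.5 * query @ query

def attention_update(query, patterns, beta=1.0):
    # Softmax attention with keys = values = stored patterns.
    return patterns.T @ softmax(beta * patterns @ query)

def numerical_grad(f, x, eps=1e-6):
    # Central finite differences, used only to check the analytic claim.
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
patterns = rng.normal(size=(5, 4))  # 5 stored patterns of dimension 4
query = rng.normal(size=4)

grad = numerical_grad(lambda q: energy(q, patterns), query)
# A unit-step gradient descent move on E should match the attention output.
max_gap = np.max(np.abs((query - grad) - attention_update(query, patterns)))
print(max_gap)
```

The gap printed at the end should be at the level of finite-difference error. The abstract's point is that this single global energy does not carry over to self-attention, which is why the authors turn to local, pseudo-likelihood-like energy terms.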

Code Implementations: 1 repo
