LG · AI · Apr 16

Layerwise Dynamics for In-Context Classification in Transformers

arXiv:2604.11613 · 66.1 · h-index: 2
Predicted impact: top 30% in LG · last 90 days · Originality: Highly original
AI Analysis

For researchers studying in-context learning, this paper provides what the authors describe as the first end-to-end identified algorithm inside a softmax transformer, making the inference-time computation interpretable.

The paper derives an explicit, depth-indexed recursion for in-context classification in transformers, revealing an emergent update rule that can provably amplify class separation and yields robust expected class alignment.

Transformers can perform in-context classification from a few labeled examples, yet the inference-time algorithm remains opaque. We study multi-class linear classification in the hard no-margin regime and make the computation identifiable by enforcing feature- and label-permutation equivariance at every layer. This enables interpretability while maintaining functional equivalence and yields highly structured weights. From these models we extract an explicit depth-indexed recursion: an end-to-end identified, emergent update rule inside a softmax transformer, to our knowledge the first of its kind. Attention matrices formed from mixed feature-label Gram structure drive coupled updates of training points, labels, and the test probe. The resulting dynamics implement a geometry-driven algorithmic motif, which can provably amplify class separation and yields robust expected class alignment.
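The abstract does not state the recursion itself, so the following is only a toy sketch of the general shape it describes: softmax attention scores built from a mixed feature-label Gram matrix, driving coupled updates of training points, labels, and the test probe. Everything here is an assumption made for illustration, not the paper's extracted update rule: the function names `layer_step` and `run`, the scalar weights `alpha` and `beta`, the step size `eta`, and the residual averaging form of the update are all hypothetical.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def layer_step(X, Y, x_q, y_q, alpha=1.0, beta=1.0, eta=0.5):
    """One hypothetical layer (illustrative, not the paper's recursion).

    X: (n, d) training features, Y: (n, C) label vectors (one-hot at depth 0),
    x_q: (d,) test probe features, y_q: (C,) probe label estimate.
    """
    # Mixed feature-label Gram scores between training points.
    G = alpha * (X @ X.T) + beta * (Y @ Y.T)
    A = softmax(G, axis=1)            # (n, n), each row sums to 1
    # Scores between the probe and every training point.
    g_q = alpha * (X @ x_q) + beta * (Y @ y_q)
    a_q = softmax(g_q)                # (n,), sums to 1
    # Coupled residual updates: each token moves toward its
    # attention-weighted average of the training tokens.
    X_new = X + eta * (A @ X - X)
    Y_new = Y + eta * (A @ Y - Y)
    x_q_new = x_q + eta * (a_q @ X - x_q)
    y_q_new = y_q + eta * (a_q @ Y - y_q)
    return X_new, Y_new, x_q_new, y_q_new

def run(X, Y, x_q, depth=4, **kw):
    """Apply `depth` layers and return the probe's label scores."""
    y_q = np.zeros(Y.shape[1])        # the probe starts with no label
    for _ in range(depth):
        X, Y, x_q, y_q = layer_step(X, Y, x_q, y_q, **kw)
    return y_q
```

Because the scores depend on features and labels only through inner products and the updates are linear in `X` and `Y`, this toy layer is equivariant to permutations of the feature coordinates and of the label coordinates, which is the symmetry the abstract says is enforced at every layer. Permuting the columns of `Y` permutes the entries of the output of `run` in the same way, and permuting the columns of `X` together with the entries of `x_q` leaves the output unchanged.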
