Spin glass model of in-context learning

arXiv:2408.02288v3 | 2 citations | h-index: 3
Originality: Incremental advance
AI Analysis

For researchers in machine learning and physics, this work addresses the challenge of understanding in-context learning and offers an incremental theoretical framework.

The authors give a mechanistic interpretation of in-context learning in large language models by mapping a transformer with linear attention to a spin glass model. Their analysis shows that increasing task diversity during pre-training lets in-context learning emerge, so that a new prompt can be handled without further training.

Large language models show a surprising in-context learning ability -- being able to use a prompt to form a prediction for a query, yet without additional training, in stark contrast to old-fashioned supervised learning. Providing a mechanistic interpretation and linking the empirical phenomenon to physics are thus challenging and remain unsolved. We study a simple yet expressive transformer with linear attention and map this structure to a spin glass model with real-valued spins, where the couplings and fields explain the intrinsic disorder in data. The spin glass model explains how the weight parameters interact with each other during pre-training, and further clarifies why an unseen function can be predicted by providing only a prompt yet without further training. Our theory reveals that for single-instance learning, increasing the task diversity leads to the emergence of in-context learning, by allowing the Boltzmann distribution to converge to a unique correct solution of weight parameters. Therefore the pre-trained transformer displays a prediction power in a novel prompt setting. The proposed analytically tractable model thus offers a promising avenue for thinking about how to interpret many intriguing but puzzling properties of large language models.
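The mapping described in the abstract can be illustrated with a small numerical sketch. It is not the paper's exact construction. It assumes a commonly used reduced form of linear attention for in-context linear regression, in which the prediction for a query is `x_q^T M g` with `g = (1/N) sum_i y_i x_i` and a trainable `D x D` matrix `M`. Under that assumption the squared pre-training loss is a quadratic energy in the flattened weights `m`, namely `E(m) = 1/2 m^T J m - h^T m + const`, where the couplings `J` and fields `h` are built from the data. All names (`make_prompts`, `couplings_and_fields`, `energy`) and the sizes `D`, `N`, `P` are hypothetical choices for illustration.

```python
import numpy as np

def make_prompts(P, D, N, rng):
    """Draw P prompts, each from its own random linear task y = w . x.

    Returns the per-prompt feature vectors S (shape P x D*D) and the
    query labels y_q (shape P). Assumed reduced linear-attention form:
    y_hat = x_q^T M g, with g = (1/N) sum_i y_i x_i.
    """
    w = rng.normal(size=(P, D)) / np.sqrt(D)   # one task vector per prompt
    X = rng.normal(size=(P, N, D))             # context inputs
    y = np.einsum("pnd,pd->pn", X, w)          # context labels
    x_q = rng.normal(size=(P, D))              # query inputs
    y_q = np.einsum("pd,pd->p", x_q, w)        # query labels (targets)
    g = np.einsum("pn,pnd->pd", y, X) / N      # prompt summary vector
    # y_hat = sum_ab M[a, b] * x_q[a] * g[b] = vec(M) . vec(x_q g^T)
    S = np.einsum("pa,pb->pab", x_q, g).reshape(P, D * D)
    return S, y_q

def couplings_and_fields(S, y_q):
    """Couplings J and fields h of the quadratic energy in the weights."""
    J = S.T @ S        # weight-weight interactions induced by the data
    h = S.T @ y_q      # local fields pulling weights toward the targets
    return J, h

def energy(m, J, h, y_q):
    """E(m) = 1/2 m^T J m - h^T m + 1/2 |y_q|^2, equal to half the squared loss."""
    return 0.5 * m @ J @ m - h @ m + 0.5 * y_q @ y_q

rng = np.random.default_rng(0)
D, N = 4, 32

# Low task diversity: fewer prompts than weights (D*D = 16).
S_few, y_few = make_prompts(4, D, N, rng)
J_few, h_few = couplings_and_fields(S_few, y_few)

# High task diversity: many more prompts than weights.
S_many, y_many = make_prompts(400, D, N, rng)
J_many, h_many = couplings_and_fields(S_many, y_many)

rank_few = np.linalg.matrix_rank(J_few)
rank_many = np.linalg.matrix_rank(J_many)
print("rank of J with few tasks: ", rank_few, "of", D * D)
print("rank of J with many tasks:", rank_many, "of", D * D)
```

In this toy setting, a rank-deficient `J` leaves flat directions in the energy, so many weight configurations fit the pre-training data equally well. Once the number of distinct tasks is large enough for `J` to reach full rank, the energy has a single minimum, and a Boltzmann distribution proportional to `exp(-beta * E(m))` becomes a Gaussian that concentrates on it as `beta` grows. This is only meant to make the abstract's statement about task diversity concrete; the paper's own Hamiltonian and analysis may differ in detail.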

