DIS-NN · LG · ML · Oct 6, 2025

Learning Linear Regression with Low-Rank Tasks in-Context

arXiv:2510.04548v1 · 2 citations · h-index: 1
Originality: Highly original
AI Analysis

This work provides theoretical insight into how transformers learn task structure, an incremental advance in the understanding of in-context learning.

The paper studies the mechanisms of in-context learning in transformers by analyzing a linear attention model trained on low-rank regression tasks. It characterizes the distribution of predictions and the generalization error in the high-dimensional limit, and identifies a sharp phase transition in the generalization error governed by task structure.

In-context learning (ICL) is a key building block of modern large language models, yet its theoretical mechanisms remain poorly understood. It is particularly mysterious how ICL operates in real-world applications where tasks have a common structure. In this work, we address this problem by analyzing a linear attention model trained on low-rank regression tasks. Within this setting, we precisely characterize the distribution of predictions and the generalization error in the high-dimensional limit. Moreover, we find that statistical fluctuations in finite pre-training data induce an implicit regularization. Finally, we identify a sharp phase transition of the generalization error governed by task structure. These results provide a framework for understanding how transformers learn to learn the task structure.
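To make the setting concrete, here is a minimal NumPy sketch of in-context linear regression with low-rank tasks. It is an illustration under stated assumptions, not the paper's exact model: task vectors are drawn from a random r-dimensional subspace of R^d, and the linear attention model is replaced by the reduced predictor y_hat = x_q^T Γ (1/n) Σ_i y_i x_i, a simplification commonly used in analyses of linear attention. All function names and parameter values (`random_subspace`, `compare_gammas`, d = 50, r = 5, and so on) are hypothetical choices made for this example.

```python
import numpy as np


def random_subspace(rng, d, r):
    """Return a d x r matrix with orthonormal columns (assumed task subspace)."""
    q, _ = np.linalg.qr(rng.standard_normal((d, r)))
    return q


def sample_task(rng, basis):
    """Draw a task vector inside the subspace, with squared norm about 1."""
    _, r = basis.shape
    return basis @ rng.standard_normal(r) / np.sqrt(r)


def linear_attention_predict(gamma, X, y, x_query):
    """Reduced linear-attention predictor (an assumed simplification):
    y_hat = x_query^T Gamma (1/n) sum_i y_i x_i."""
    h = X.T @ y / len(y)
    return x_query @ gamma @ h


def mean_squared_error(rng, gamma, basis, n, noise_std, num_tasks):
    """Average squared error against the noiseless target over fresh tasks."""
    d = basis.shape[0]
    errors = []
    for _ in range(num_tasks):
        w = sample_task(rng, basis)
        X = rng.standard_normal((n, d))                  # context inputs
        y = X @ w + noise_std * rng.standard_normal(n)   # noisy context labels
        x_query = rng.standard_normal(d)
        pred = linear_attention_predict(gamma, X, y, x_query)
        errors.append((pred - x_query @ w) ** 2)
    return float(np.mean(errors))


def compare_gammas(seed=0, d=50, r=5, n=100, noise_std=0.1, num_tasks=500):
    """Compare a structure-agnostic Gamma (identity) with one aligned to the
    task subspace (the orthogonal projector onto it)."""
    rng = np.random.default_rng(seed)
    basis = random_subspace(rng, d, r)
    err_identity = mean_squared_error(
        rng, np.eye(d), basis, n, noise_std, num_tasks
    )
    err_projector = mean_squared_error(
        rng, basis @ basis.T, basis, n, noise_std, num_tasks
    )
    return err_identity, err_projector


if __name__ == "__main__":
    err_identity, err_projector = compare_gammas()
    print(f"identity Gamma:  {err_identity:.3f}")
    print(f"projector Gamma: {err_projector:.3f}")
```

In this toy setup the projector discards the components of the context estimate that lie outside the task subspace, so it should give a lower error than the identity whenever r is much smaller than d. The sketch only illustrates why shared task structure helps; it does not reproduce the paper's high-dimensional characterization, the implicit regularization from finite pre-training data, or the phase transition.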

