CL · Sep 29, 2025

Localizing Task Recognition and Task Learning in In-Context Learning via Attention Head Analysis

arXiv:2509.24164v1 · 2.7 · h-index: 4

Originality: Incremental advance

AI Analysis

The paper offers a unified, interpretable account of in-context learning mechanisms. Its contribution is incremental, since it mainly reconciles existing perspectives.

The paper studies how large language models perform in-context learning. It proposes a framework for identifying attention heads specialized in task recognition and task learning, and shows that the two groups play distinct roles: task-recognition heads align hidden states with the task subspace, while task-learning heads rotate hidden states within that subspace toward the correct label.

We investigate the mechanistic underpinnings of in-context learning (ICL) in large language models by reconciling two dominant perspectives: the component-level analysis of attention heads and the holistic decomposition of ICL into Task Recognition (TR) and Task Learning (TL). We propose a novel framework based on Task Subspace Logit Attribution (TSLA) to identify attention heads specialized in TR and TL, and demonstrate their distinct yet complementary roles. Through correlation analysis, ablation studies, and input perturbations, we show that the identified TR and TL heads independently and effectively capture the TR and TL components of ICL. Using steering experiments with geometric analysis of hidden states, we reveal that TR heads promote task recognition by aligning hidden states with the task subspace, while TL heads rotate hidden states within the subspace toward the correct label to facilitate prediction. We further show how previous findings on ICL mechanisms, including induction heads and task vectors, can be reconciled with our attention-head-level analysis of the TR-TL decomposition. Our framework thus provides a unified and interpretable account of how large language models execute ICL across diverse tasks and settings.
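The geometric picture in the abstract (alignment with a task subspace versus rotation within it) can be illustrated with a small toy sketch. This is not the paper's TSLA procedure, which is not specified on this page. It assumes, purely for illustration, that the task subspace is spanned by a handful of label direction vectors, and all names (`subspace_alignment`, `in_subspace_cosine`, the toy vectors) are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def orthonormal_basis(vectors, eps=1e-12):
    """Gram-Schmidt: orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        n = norm(w)
        if n > eps:  # skip vectors already in the span
            basis.append([wi / n for wi in w])
    return basis

def project(h, basis):
    """Orthogonal projection of h onto the span of an orthonormal basis."""
    out = [0.0] * len(h)
    for b in basis:
        c = dot(h, b)
        out = [o + c * bi for o, bi in zip(out, b)]
    return out

def subspace_alignment(h, basis):
    """Share of h's norm that lies inside the subspace (between 0 and 1)."""
    return norm(project(h, basis)) / norm(h)

def in_subspace_cosine(h, basis, label_vec):
    """Cosine between h and a label direction, both projected into the subspace.
    Assumes neither projection is the zero vector."""
    p = project(h, basis)
    q = project(label_vec, basis)
    return dot(p, q) / (norm(p) * norm(q))

# Toy setup (assumption): two label directions span the task subspace in 3-D.
label_a = [1.0, 0.0, 0.0]
label_b = [0.0, 1.0, 0.0]  # treated as the correct label
basis = orthonormal_basis([label_a, label_b])

h_before = [1.0, 0.0, 2.0]  # mostly outside the subspace, pointing at label_a
h_tr = [1.0, 0.0, 0.5]      # "TR-like" change: off-subspace component shrinks
h_tl = [0.0, 1.0, 2.0]      # "TL-like" change: in-subspace part rotates to label_b

for name, h in [("before", h_before), ("TR-like", h_tr), ("TL-like", h_tl)]:
    print(name,
          round(subspace_alignment(h, basis), 3),
          round(in_subspace_cosine(h, basis, label_b), 3))
```

In this toy setting the two effects separate cleanly: the TR-like change raises the alignment score without changing the in-subspace direction, while the TL-like change leaves alignment unchanged and moves the in-subspace cosine toward the correct label. Real hidden states would not decompose this neatly.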

