Emergent Structured Representations Support Flexible In-Context Inference in Large Language Models
This work provides causal evidence that LLMs functionally rely on emergent structured representations for flexible reasoning, addressing a key question in mechanistic interpretability.
LLMs construct a structured conceptual subspace in middle-to-late layers that is causally central to in-context inference, with attention heads in early-to-middle layers building and refining this subspace for later prediction layers.
Large language models (LLMs) exhibit emergent behaviors suggestive of human-like reasoning. While recent work has identified structured conceptual representations within these models, it remains unclear whether the models functionally rely on such representations for reasoning. Here we investigate the internal processing of LLMs during in-context inference across diverse tasks. Our results reveal a conceptual subspace emerging in middle-to-late layers, whose representational structure persists across contexts. Using causal mediation analyses, we demonstrate that this subspace is not merely an epiphenomenon but is functionally central to model predictions. We further identify a layer-wise progression in which attention heads in early-to-middle layers integrate contextual cues to construct and refine the subspace, which later layers then leverage to generate predictions. Together, these findings provide evidence that LLMs dynamically construct and use structured latent representations in context for inference, offering insights into the computational processes underlying flexible adaptation.
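To make the idea of testing a subspace's causal role concrete, the following is a minimal sketch on synthetic data, not the procedure used in this work. It substitutes a simple projection ablation for a full causal mediation analysis: a low-dimensional subspace is estimated from hidden states with PCA, projected out, and the resulting change in a toy linear readout is compared against ablating a random subspace of the same dimension. All names (`fit_subspace`, `ablate_subspace`, `random_basis`), the dimensions, and the synthetic "hidden states" are assumptions made for illustration.

```python
import numpy as np

def fit_subspace(hidden, k):
    """Return an orthonormal basis (d x k) spanning the top-k principal directions."""
    centered = hidden - hidden.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T

def ablate_subspace(hidden, basis):
    """Remove the component of each hidden state that lies in span(basis)."""
    return hidden - (hidden @ basis) @ basis.T

def random_basis(d, k, rng):
    """Orthonormal basis (d x k) of a random k-dimensional subspace."""
    q, _ = np.linalg.qr(rng.standard_normal((d, k)))
    return q

rng = np.random.default_rng(0)
n, d, k = 500, 64, 3  # hypothetical sample count, hidden size, subspace size

# Synthetic stand-in for hidden states: k latent "concept" coordinates
# embedded in d dimensions, plus small isotropic noise.
concepts = rng.standard_normal((n, k))
embed = random_basis(d, k, rng)
hidden = 5.0 * concepts @ embed.T + 0.1 * rng.standard_normal((n, d))

# Toy linear readout that depends only on the concept directions.
readout = embed @ rng.standard_normal(k)
logits = hidden @ readout

# Intervention: ablate the estimated subspace, then a random control subspace.
basis = fit_subspace(hidden, k)
control = random_basis(d, k, rng)
effect_subspace = np.mean(np.abs(logits - ablate_subspace(hidden, basis) @ readout))
effect_control = np.mean(np.abs(logits - ablate_subspace(hidden, control) @ readout))

print(f"mean |delta logit|, estimated subspace: {effect_subspace:.3f}")
print(f"mean |delta logit|, random control:     {effect_control:.3f}")
```

In this toy setting the readout is built to depend on the concept directions, so ablating the estimated subspace changes the output far more than the random control does. With a real model, the hidden states would come from a chosen layer and the effect would be measured on the model's actual predictions; the random-subspace control is what separates a specific causal role from the generic damage caused by removing any directions.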