LG, CL · May 11

The Truth Lies Somewhere in the Middle (of the Generated Tokens)

Sophie L. Wang, Phillip Isola, Brian Cheung

arXiv:2605.09969

AI Analysis

Provides a simple, effective method for extracting better representations from autoregressive language models, relevant to practitioners using such models for downstream tasks.

Mean pooling across hidden states of autoregressively generated tokens yields more semantic representations than using any single token, outperforming prompt-based representations across language, vision, and protein domains.

How should hidden states generated autoregressively be collapsed into a representation that reflects a language model's internal state? Despite tokens being generated under causal masking, we find that mean pooling across their hidden states yields more semantic representations than any individual token alone. We quantify this through kernel alignment to reference spaces in language, vision, and protein domains. The improvement through mean pooling is consistent with information being distributed across generated tokens rather than localized to a single position. Furthermore, representations derived from generated tokens outperform those from prompt tokens, and alignment across generation reveals interpretable dynamics in model behavior.
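As a rough illustration of the two ideas in the abstract, the sketch below mean-pools per-token hidden states into one vector per sample and then scores how well a set of such vectors aligns with a reference space. It is not the authors' implementation. The abstract does not say which kernel-alignment metric is used, so linear CKA is swapped in here as one common choice, and the function names (`mean_pool`, `last_token`, `linear_cka`) and the toy numbers are hypothetical. Plain Python lists are used to keep the example dependency-free; an array library would be the usual choice in practice.

```python
from math import sqrt


def mean_pool(hidden_states):
    """Average T per-token hidden vectors (each of length D) into one D-vector."""
    num_tokens = len(hidden_states)
    dim = len(hidden_states[0])
    return [sum(h[d] for h in hidden_states) / num_tokens for d in range(dim)]


def last_token(hidden_states):
    """Single-token baseline: use only the final generated token's hidden state."""
    return list(hidden_states[-1])


def _centered_gram(features):
    """Linear kernel K = X X^T over N samples, double-centered (H K H)."""
    n = len(features)
    gram = [[sum(a * b for a, b in zip(x, y)) for y in features] for x in features]
    row_means = [sum(row) / n for row in gram]  # equals column means (K is symmetric)
    total_mean = sum(row_means) / n
    return [
        [gram[i][j] - row_means[i] - row_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def _frobenius_inner(p, q):
    """Sum of elementwise products of two equally shaped matrices."""
    return sum(x * y for row_p, row_q in zip(p, q) for x, y in zip(row_p, row_q))


def linear_cka(features_a, features_b):
    """Linear CKA between two representations of the same N samples.

    Assumption: used here as a stand-in kernel-alignment score; the source
    does not name its metric. Returns a value in [0, 1].
    """
    ka = _centered_gram(features_a)
    kb = _centered_gram(features_b)
    return _frobenius_inner(ka, kb) / sqrt(
        _frobenius_inner(ka, ka) * _frobenius_inner(kb, kb)
    )


# Toy data (made up): 3 samples, each with 2 generated tokens of dimension 2.
generations = [
    [[1.0, 0.0], [3.0, 2.0]],
    [[0.0, 1.0], [2.0, 5.0]],
    [[4.0, 4.0], [0.0, 2.0]],
]
# Hypothetical reference embeddings for the same 3 samples.
reference = [[2.0, 1.0], [1.0, 3.0], [2.0, 3.0]]

pooled = [mean_pool(tokens) for tokens in generations]
single = [last_token(tokens) for tokens in generations]

alignment_pooled = linear_cka(pooled, reference)
alignment_single = linear_cka(single, reference)
```

Comparing `alignment_pooled` with `alignment_single` on real hidden states and a real reference space is the kind of check the abstract describes; the toy values here carry no empirical meaning.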
