LG · CL · May 27, 2025

Born a Transformer -- Always a Transformer? On the Effect of Pretraining on Architectural Abilities

arXiv:2505.21785v3 · 5 citations · h-index: 9
Originality: Incremental advance
AI Analysis

This work addresses reliability risks in large language models by showing that pretraining does not eliminate architectural constraints, which is important for researchers and practitioners in AI safety and model evaluation.

The study investigates whether large-scale pretraining enables transformers to overcome their theoretical limitations on certain sequence-to-sequence tasks, focusing on retrieval and copying. It finds that pretraining selectively strengthens some capabilities, favoring induction over anti-induction, but does not overcome fundamental length-generalization limits.

Transformers have theoretical limitations in modeling certain sequence-to-sequence tasks, yet it remains largely unclear if these limitations play a role in large-scale pretrained LLMs, or whether LLMs might effectively overcome these constraints in practice due to the scale of both the models themselves and their pretraining data. We explore how these architectural constraints manifest after pretraining, by studying a family of $\textit{retrieval}$ and $\textit{copying}$ tasks inspired by Liu et al. [2024a]. We use a recently proposed framework for studying length generalization [Huang et al., 2025] to provide guarantees for each of our settings. Empirically, we observe an $\textit{induction-versus-anti-induction}$ asymmetry, where pretrained models are better at retrieving tokens to the right (induction) rather than the left (anti-induction) of a query token. This asymmetry disappears upon targeted fine-tuning if length-generalization is guaranteed by theory. Mechanistic analysis reveals that this asymmetry is connected to the differences in the strength of induction versus anti-induction circuits within pretrained transformers. We validate our findings through practical experiments on real-world tasks demonstrating reliability risks. Our results highlight that pretraining selectively enhances certain transformer capabilities, but does not overcome fundamental length-generalization limits.
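To make the induction-versus-anti-induction distinction in the abstract concrete, here is a minimal sketch of a toy retrieval task. It is an illustration only, not the paper's actual task format or evaluation code: the function names, the unique-token context, and the direction labels are all assumptions made for this example. Given a context and a query token that appears in it, the induction target is the token immediately to the right of the query's occurrence, and the anti-induction target is the token immediately to its left.

```python
import random

# Hypothetical direction labels for this sketch (not taken from the paper).
OFFSETS = {"induction": 1, "anti_induction": -1}


def make_retrieval_example(length, vocab, direction, rng):
    """Build one toy example: (context, query, target).

    Assumption: the context holds unique tokens, so the query occurs
    exactly once and the target is unambiguous.
    """
    if direction not in OFFSETS:
        raise ValueError(f"unknown direction: {direction}")
    if length < 3 or length > len(vocab):
        raise ValueError("length must be between 3 and len(vocab)")
    context = rng.sample(vocab, length)
    # Pick an interior position so both neighbours exist.
    pos = rng.randrange(1, length - 1)
    return context, context[pos], context[pos + OFFSETS[direction]]


def solve(context, query, direction):
    """Reference solver: find the query, return its right or left neighbour."""
    pos = context.index(query)
    return context[pos + OFFSETS[direction]]


if __name__ == "__main__":
    print(solve(["a", "b", "c"], "b", "induction"))       # -> c
    print(solve(["a", "b", "c"], "b", "anti_induction"))  # -> a
```

Under these assumptions, a length-generalization test would generate examples with longer contexts than those seen during fine-tuning and compare accuracy across the two directions.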

Code Implementations: 1 repo
