Probing Minimalist Phase Structure in LLMs: What Universal Dependencies Cannot Represent
For computational linguists and cognitive scientists, this work suggests that LLMs learn formal-syntactic abstractions that are invisible to UD-based probes, so UD-based probing yields only a lower bound on syntactic encoding.
The paper shows that LLMs encode Minimalist Program (MP) phase structure beyond what Universal Dependencies (UD) can represent: 12/13 models show a phase-count gradient, 13/13 show a sign asymmetry predicted by phase-internal cohesion, and activation patching confirms causal relevance in 12/13 models.
Structural probes are trained on Universal Dependencies (UD), which does not encode formal-syntactic abstractions such as phase boundaries or phase-internal cohesion. Whether large language models (LLMs) encode these abstractions remains an open question that UD-based probing cannot answer by construction. We evaluate structural probes on wh-movement stimuli where UD distances are invariant across conditions by design -- any non-zero effect therefore reflects structure beyond UD. The three conditions -- bare small clause, infinitival, and finite -- are ordered by the number of Minimalist Program (MP) phase boundaries the wh-element crosses. Across 13 LLMs from four families, we find a phase-count gradient on a cross-clause pair (12/13 models) and a 13/13 sign asymmetry on a within-clause pair whose UD distance is identical across conditions -- the latter specifically predicted by phase-internal cohesion, an MP abstraction invisible to UD by construction. Activation patching confirms the representations are causally active in 12/13 models. These findings suggest that distributional pretraining can induce representations aligned with formal-syntactic abstractions beyond the reach of annotation-based probing; UD-grounded probes provide a lower bound on syntactic encoding, not an upper bound.
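
The logic of the design can be sketched in a few lines. This is an illustrative sketch, not the paper's implementation. It assumes the common distance-probe formulation, in which a learned linear map B projects word representations and the squared L2 distance between two projected words is read as their syntactic distance. The function names, the identity matrix standing in for a trained probe, and the toy vectors are all hypothetical; none of the numbers are results from the paper. Because the UD distance for the word pair is the same in both conditions, any non-zero contrast in probe distance would have to come from structure that UD does not annotate.

```python
from statistics import mean

def probe_distance(B, h_i, h_j):
    """Squared L2 distance between h_i and h_j after the linear map B.

    B is a list of rows (k x d); h_i and h_j are length-d lists.
    Assumed probe form: ||B (h_i - h_j)||^2.
    """
    diff = [a - b for a, b in zip(h_i, h_j)]
    projected = [sum(w * x for w, x in zip(row, diff)) for row in B]
    return sum(p * p for p in projected)

def condition_contrast(B, pairs_a, pairs_b):
    """Mean probe distance in condition a minus that in condition b.

    Each element of pairs_a / pairs_b is (h_i, h_j) for the same word pair,
    whose UD distance is held constant across the two conditions.
    """
    d_a = mean(probe_distance(B, h_i, h_j) for h_i, h_j in pairs_a)
    d_b = mean(probe_distance(B, h_i, h_j) for h_i, h_j in pairs_b)
    return d_a - d_b

# Hypothetical stand-in for a trained probe (identity map on 2-d vectors).
B = [[1.0, 0.0], [0.0, 1.0]]

# Toy vectors only; real inputs would be hidden states from an LLM layer.
finite_pairs = [([1.0, 0.0], [0.0, 1.0])]
small_clause_pairs = [([1.0, 0.0], [1.0, 0.5])]

contrast = condition_contrast(B, finite_pairs, small_clause_pairs)
print(contrast)
```

In a setup like this, the sign of the contrast is the quantity of interest: a consistent sign across models is what a claim such as "13/13 sign asymmetry" would summarize, while the magnitude depends on the probe and the layer.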