CL | Mar 26, 2025

Both Direct and Indirect Evidence Contribute to Dative Alternation Preferences in Language Models

arXiv:2503.20850v3 | 10 citations | h-index: 11
Originality: Incremental advance
AI Analysis

This paper addresses how language models acquire syntactic preferences, a question relevant to researchers in computational linguistics and AI. It is an incremental advance that builds on prior work on language biases.

The study investigates whether language models' preferences in the English dative alternation arise from direct exposure to dative constructions or from indirect evidence in general properties of language, such as length and animacy ordering. It finds that both sources contribute to these preferences.

Language models (LMs) tend to show human-like preferences on a number of syntactic phenomena, but the extent to which these are attributable to direct exposure to the phenomena or more general properties of language is unclear. We explore this with the English dative alternation (DO: "gave Y the X" vs. PO: "gave the X to Y"), using a controlled rearing paradigm wherein we iteratively train small LMs on systematically manipulated input. We focus on two properties that affect the choice of alternant: length and animacy. Both properties are directly present in datives but also reflect more global tendencies for shorter elements to precede longer ones and animates to precede inanimates. First, by manipulating and ablating datives for these biases in the input, we show that direct evidence of length and animacy matters, but easy-first preferences persist even without such evidence. Then, using LMs trained on systematically perturbed datasets to manipulate global length effects (re-linearizing sentences globally while preserving dependency structure), we find that dative preferences can emerge from indirect evidence. We conclude that LMs' emergent syntactic preferences come from a mix of direct and indirect sources.
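To make the two alternants and the length-based "easy-first" tendency concrete, here is a minimal sketch. It is not the paper's code: the `Dative` class and `length_preferred` function are hypothetical names, and measuring length in whitespace-separated tokens is an assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dative:
    """A dative event with its two post-verbal arguments (hypothetical helper)."""
    subject: str
    verb: str
    recipient: str  # Y in "gave Y the X"
    theme: str      # X in "gave the X to Y"; include any determiner here

    def do(self) -> str:
        # Double-object alternant: recipient precedes theme.
        return f"{self.subject} {self.verb} {self.recipient} {self.theme}"

    def po(self) -> str:
        # Prepositional alternant: theme precedes recipient.
        return f"{self.subject} {self.verb} {self.theme} to {self.recipient}"


def length_preferred(d: Dative) -> str:
    """Return the alternant that places the shorter argument first.

    Assumption: length is counted in whitespace-separated tokens.
    """
    recipient_len = len(d.recipient.split())
    theme_len = len(d.theme.split())
    if recipient_len < theme_len:
        return "DO"   # short recipient first
    if theme_len < recipient_len:
        return "PO"   # short theme first
    return "tie"


short_recipient = Dative("she", "gave", "him", "the book she bought yesterday")
short_theme = Dative("she", "gave", "the man in the blue coat", "it")

print(short_recipient.do(), "|", length_preferred(short_recipient))
print(short_theme.po(), "|", length_preferred(short_theme))
```

An LM's preference for one alternant could then be compared against this length-based prediction, for example by scoring `do()` and `po()` under the model; that scoring step is left out here because it depends on the model and tokenizer used.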

Code Implementations: 1 repo
