CL · AI · Jul 29, 2025

Meaning-infused grammar: Gradient Acceptability Shapes the Geometric Representations of Constructions in LLMs

arXiv:2507.22286v2 · 2 citations · h-index: 2
Originality: Incremental advance
AI Analysis

The paper provides evidence for usage-based constructionist theories in LLMs, but the advance is incremental because it examines only two constructions in a single model.

The study investigated whether large language models (LLMs) learn graded, meaning-infused representations of English grammatical constructions, finding that the separability of representations in Pythia-1.4B systematically varies with human-rated preference strength, as measured by energy distance and Jensen-Shannon divergence.

The usage-based constructionist (UCx) approach to language posits that language comprises a network of learned form-meaning pairings (constructions) whose use is largely determined by their meanings or functions, requiring them to be graded and probabilistic. This study investigates whether the internal representations in Large Language Models (LLMs) reflect the proposed function-infused gradience. We analyze representations of the English Double Object (DO) and Prepositional Object (PO) constructions in Pythia-1.4B, using a dataset of 5000 sentence pairs systematically varied by human-rated preference strength for DO or PO. Geometric analyses show that the separability between the two constructions' representations, as measured by energy distance or Jensen-Shannon divergence, is systematically modulated by gradient preference strength, which depends on lexical and functional properties of sentences. That is, more prototypical exemplars of each construction occupy more distinct regions in activation space than sentences that could equally well have occurred in either construction. These results provide evidence that LLMs learn rich, meaning-infused, graded representations of constructions and support the use of geometric measures for analyzing representations in LLMs.
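To make the separability measure concrete, here is a minimal sketch of the energy distance statistic, 2E‖X−Y‖ − E‖X−X′‖ − E‖Y−Y′‖, computed between two clouds of vectors (some definitions take the square root of this quantity). It is an illustration under stated assumptions, not the paper's pipeline: the vectors are synthetic Gaussian stand-ins rather than Pythia-1.4B activations, and the names `sample_cloud`, `do_strong`, `po_weak`, and so on are hypothetical. The abstract does not say which layer, pooling, or estimator the authors use.

```python
import math
import random


def mean_pairwise_distance(a, b):
    """Average Euclidean distance over all pairs (one vector from a, one from b)."""
    total = sum(math.dist(u, v) for u in a for v in b)
    return total / (len(a) * len(b))


def energy_distance(x, y):
    """V-statistic estimate of 2E|X-Y| - E|X-X'| - E|Y-Y'| for two samples."""
    return (2.0 * mean_pairwise_distance(x, y)
            - mean_pairwise_distance(x, x)
            - mean_pairwise_distance(y, y))


def sample_cloud(rng, n, dim, shift):
    """Synthetic stand-in for hidden states: unit-variance Gaussian centred at `shift`."""
    return [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
DIM, N = 8, 100  # toy sizes, chosen only to keep the example fast

# Assumption: a "strong preference" bin is modelled as two well-separated clouds,
# and a "weak preference" bin as two nearly overlapping clouds.
do_strong = sample_cloud(rng, N, DIM, shift=0.0)
po_strong = sample_cloud(rng, N, DIM, shift=1.0)
do_weak = sample_cloud(rng, N, DIM, shift=0.0)
po_weak = sample_cloud(rng, N, DIM, shift=0.1)

ed_strong = energy_distance(do_strong, po_strong)
ed_weak = energy_distance(do_weak, po_weak)
print(f"strong-preference bin: {ed_strong:.3f}")
print(f"weak-preference bin:   {ed_weak:.3f}")
```

In this toy setup the well-separated pair yields the larger energy distance, which mirrors the pattern the abstract reports: more prototypical exemplars of each construction occupy more distinct regions of activation space.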
