CL, Feb 18, 2025

Linguistic Generalizations are not Rules: Impacts on Evaluation of LMs

Leonie Weissweiler, Kyle Mahowald, Adele Goldberg

arXiv:2502.13195v3 15.510 citations h-index: 4

Originality: Incremental advance

AI Analysis

The paper challenges evaluation practices in NLP by proposing a shift toward benchmarks that assess LMs' ability to handle gradient and contextual aspects of language, which could affect researchers and practitioners in AI and linguistics.

The paper argues that evaluating language models (LMs) against symbolic rules is flawed because natural languages rely on flexible, context-dependent constructions rather than strict rules. On this view, LMs' failures to obey such rules may indicate that they capture more realistic linguistic generalizations.

Linguistic evaluations of how well LMs generalize to produce or understand language often implicitly take for granted that natural languages are generated by symbolic rules. According to this perspective, grammaticality is determined by whether sentences obey such rules. Interpretation is compositionally generated by syntactic rules operating on meaningful words. Semantic parsing maps sentences into formal logic. Failures of LMs to obey strict rules are presumed to reveal that LMs do not produce or understand language like humans. Here we suggest that LMs' failures to obey symbolic rules may be a feature rather than a bug, because natural languages are not based on neatly separable, compositional rules. Rather, new utterances are produced and understood by a combination of flexible, interrelated, and context-dependent constructions. Considering gradient factors such as frequencies, context, and function will help us reimagine new benchmarks and analyses to probe whether and how LMs capture the rich, flexible generalizations that comprise natural languages.
