Quantifier Scope Interpretation in Language Learners and LLMs
This work addresses how LLMs handle linguistic ambiguities across languages, an incremental contribution to understanding how closely models align with human cognition.
This study examined how large language models (LLMs) interpret quantifier scope ambiguities in English and Chinese, finding that most LLMs, like humans, prefer surface scope interpretations, while some show language-specific differences in inverse scope preferences.
Sentences with multiple quantifiers often give rise to interpretive ambiguities, and these can vary across languages. This study adopts a cross-linguistic approach to examine how large language models (LLMs) handle quantifier scope interpretation in English and Chinese, using probabilities to assess the likelihood of each interpretation. Human similarity (HS) scores were used to quantify the extent to which LLMs emulate human performance across language groups. Results reveal that most LLMs prefer surface scope interpretations, in line with human tendencies, while only some differentiate between English and Chinese in their inverse scope preferences, reflecting human-like patterns. HS scores vary across LLMs in how closely they approximate human behavior, but the models' overall potential to align with humans is notable. Differences in model architecture, scale, and especially the language background of the pre-training data significantly influence how closely LLMs approximate human quantifier scope interpretations.
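As a rough illustration of what a probability-based assessment of scope preference could look like, the sketch below normalizes two log-probabilities, one for a surface scope reading and one for an inverse scope reading, into a preference distribution. This is an assumption made for illustration only and is not the scoring procedure or the HS score used in the study. The function name `scope_preference` and the log-probability values are hypothetical.

```python
import math


def scope_preference(logp_surface: float, logp_inverse: float) -> dict:
    """Turn two log-probabilities into a normalized preference over readings.

    Hypothetical helper: the inputs stand in for whatever log-probability a
    model assigns to material supporting each reading.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logp_surface, logp_inverse)
    exp_surface = math.exp(logp_surface - m)
    exp_inverse = math.exp(logp_inverse - m)
    total = exp_surface + exp_inverse
    return {"surface": exp_surface / total, "inverse": exp_inverse / total}


# Made-up log-probabilities for the two readings of one ambiguous sentence.
example = scope_preference(logp_surface=-12.0, logp_inverse=-14.0)
print(example)
```

With these made-up inputs, the surface reading receives the larger share of the probability mass. Comparing such shares for matched English and Chinese items is one conceivable way to look for language-specific differences.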