CL, Oct 18, 2022

Systematicity in GPT-3's Interpretation of Novel English Noun Compounds

Georgia Tech, Stanford
arXiv:2210.09492v1, 293 citations, h-index: 58
Originality: Synthesis-oriented
AI Analysis

This work examines whether large language models reason systematically, a question relevant to researchers in AI and linguistics, and offers incremental insights into model limitations.

The study investigated whether GPT-3 interprets novel English noun compounds using abstract conceptual categories like artifacts vs. natural kinds, as humans do, but found it likely relies on low-level lexical patterns instead.

Levin et al. (2019) show experimentally that the interpretations of novel English noun compounds (e.g., stew skillet), while not fully compositional, are highly predictable based on whether the modifier and head refer to artifacts or natural kinds. Is the large language model GPT-3 governed by the same interpretive principles? To address this question, we first compare Levin et al.'s experimental data with GPT-3 generations, finding a high degree of similarity. However, this evidence is consistent with GPT-3 reasoning only about specific lexical items rather than the more abstract conceptual categories of Levin et al.'s theory. To probe more deeply, we construct prompts that require the relevant kind of conceptual reasoning. Here, we fail to find convincing evidence that GPT-3 is reasoning about more than just individual lexical items. These results highlight the importance of controlling for low-level distributional regularities when assessing whether a large language model latently encodes a deeper theory.
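
The abstract frames interpretation as depending on whether the modifier and the head are artifacts or natural kinds, which implies a two-by-two design over category pairs. The following is a minimal sketch of how such a stimulus grid could be enumerated. It is not the authors' procedure: the nouns, their category labels, the prompt wording, and the function name `build_conditions` are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical nouns grouped by the two conceptual categories named in the
# abstract. None of these words or labels are taken from the paper.
NOUNS = {
    "artifact": ["hammer", "blanket"],
    "natural kind": ["pebble", "otter"],
}


def build_conditions(nouns):
    """Cross modifier category with head category (a 2x2 design)."""
    items = []
    for mod_cat, head_cat in product(nouns, repeat=2):
        for modifier, head in product(nouns[mod_cat], nouns[head_cat]):
            if modifier == head:
                continue  # skip reduplicated compounds such as "hammer hammer"
            items.append({
                "modifier_category": mod_cat,
                "head_category": head_cat,
                "compound": f"{modifier} {head}",
                # Assumed prompt template, not the wording used in the paper.
                "prompt": f'What is a "{modifier} {head}"?',
            })
    return items


conditions = build_conditions(NOUNS)
```

Keeping the category labels alongside each compound makes it possible to check whether model outputs vary with the category pair or only with the particular words, which is the distinction the abstract draws between conceptual reasoning and lexical-level regularities.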

