CL, Oct 18, 2022

Systematicity in GPT-3's Interpretation of Novel English Noun Compounds

Siyan Li, Riley Carlson, Christopher Potts

Georgia Tech, Stanford

arXiv:2210.09492v1


AI Analysis

This work examines whether large language models reason systematically, a question relevant to researchers in AI and linguistics, and offers incremental insight into model limitations.

The study investigated whether GPT-3 interprets novel English noun compounds using abstract conceptual categories such as artifacts vs. natural kinds, as humans do. It found no convincing evidence of this: the model's behavior is consistent with reliance on patterns tied to individual lexical items.

Levin et al. (2019) show experimentally that the interpretations of novel English noun compounds (e.g., stew skillet), while not fully compositional, are highly predictable based on whether the modifier and head refer to artifacts or natural kinds. Is the large language model GPT-3 governed by the same interpretive principles? To address this question, we first compare Levin et al.'s experimental data with GPT-3 generations, finding a high degree of similarity. However, this evidence is consistent with GPT-3 reasoning only about specific lexical items rather than the more abstract conceptual categories of Levin et al.'s theory. To probe more deeply, we construct prompts that require the relevant kind of conceptual reasoning. Here, we fail to find convincing evidence that GPT-3 is reasoning about more than just individual lexical items. These results highlight the importance of controlling for low-level distributional regularities when assessing whether a large language model latently encodes a deeper theory.
