CL · AI · Aug 30, 2023

Response: Emergent analogical reasoning in large language models

arXiv:2308.16118v2 · 20 citations · h-index: 31
Originality: Synthesis-oriented
AI Analysis

This is an incremental critique that questions the robustness of AI reasoning claims, relevant for researchers in AI and cognitive science.

The authors challenge claims that large language models such as GPT-3 exhibit emergent zero-shot analogical reasoning. They show that GPT-3 fails on simple variations of letter string analogy tasks, whereas humans perform consistently well.

In their recent Nature Human Behaviour paper, "Emergent analogical reasoning in large language models" (Webb, Holyoak, and Lu, 2023), the authors argue that "large language models such as GPT-3 have acquired an emergent ability to find zero-shot solutions to a broad range of analogy problems." In this response, we provide counterexamples based on the letter string analogies. In our tests, GPT-3 fails to solve even the simplest variations of the original tasks, whereas human performance remains consistently high across all modified versions. Zero-shot reasoning is an extraordinary claim that requires extraordinary evidence. We do not see that evidence in our experiments. To strengthen claims of humanlike reasoning such as zero-shot reasoning, it is important that the field develop approaches that rule out data memorization.
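To make the task concrete, a letter string analogy and one possible variation of it can be sketched as follows. This is a minimal illustration and not the authors' code: the function names `make_problem` and `solve`, the successor rule, and the shuffled alphabet are all assumptions chosen for illustration, since the abstract does not specify which variations were tested.

```python
import random
import string


def make_problem(alphabet, start, length=4):
    """Return (source, target) for a successor-rule analogy over `alphabet`.

    The target copies the source but replaces the last letter with the
    next letter in `alphabet` (e.g. a b c d -> a b c e).
    """
    source = list(alphabet[start:start + length])
    target = source[:-1] + [alphabet[start + length]]
    return source, target


def solve(alphabet, query):
    """Apply the same successor rule to `query` using `alphabet`'s ordering."""
    last = alphabet.index(query[-1])
    return query[:-1] + [alphabet[last + 1]]


standard = list(string.ascii_lowercase)

# Hypothetical "modified" alphabet: the same letters in a shuffled order.
# A solver that has only memorized the standard ordering would fail here.
shuffled = standard[:]
random.Random(0).shuffle(shuffled)

for name, alphabet in [("standard", standard), ("shuffled", shuffled)]:
    source, target = make_problem(alphabet, start=0)
    query = list(alphabet[8:12])
    answer = solve(alphabet, query)
    print(name, "|", " ".join(source), "->", " ".join(target),
          "|", " ".join(query), "->", " ".join(answer))
```

The rule is identical in both cases; only the ordering of the alphabet changes. A variation of this kind is one way to separate applying an abstract rule from recalling familiar letter sequences.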
