CL · Oct 2, 2023

ARN: Analogical Reasoning on Narratives

arXiv:2310.00996v4 · 26 citations · h-index: 12
Originality: Incremental advance
AI Analysis

This paper addresses a gap in NLP research on analogical reasoning over narratives by offering a new benchmark, but the advance is incremental because it extends existing word-based methods to narratives.

The paper asks whether large language models (LLMs) can detect system analogies between narratives. It shows that LLMs perform well on near analogies but struggle with far analogies: GPT4.0 scores below random in the zero-shot setting, and even in the few-shot setting the best model performs only halfway between random and human performance.

As a core cognitive skill that enables the transferability of information across domains, analogical reasoning has been extensively studied for both humans and computational models. However, while cognitive theories of analogy often focus on narratives and study the distinction between surface, relational, and system similarities, existing work in natural language processing has a narrower focus, limited to relational analogies between word pairs. This gap raises a natural question: can state-of-the-art large language models (LLMs) detect system analogies between narratives? To gain insight into this question and extend word-based relational analogies to relational system analogies, we devise a comprehensive computational framework that operationalizes dominant theories of analogy, using narrative elements to create surface and system mappings. Leveraging the interplay between these mappings, we create a binary task and benchmark for Analogical Reasoning on Narratives (ARN), covering four categories of far (cross-domain)/near (within-domain) analogies and disanalogies. We show that while all LLMs can largely recognize near analogies, even the largest ones struggle with far analogies in a zero-shot setting, with GPT4.0 scoring below random. Guiding the models through solved examples and chain-of-thought reasoning enhances their analogical reasoning ability. Yet, since the best model performs only halfway between random and humans even in the few-shot setting, ARN opens exciting directions for computational analogical reasoners.

