CL · AI · Jun 21, 2023

Evaluating Large Language Models with NeuBAROCO: Syllogistic Reasoning Ability and Human-like Biases

arXiv:2306.12567v1 · 143 citations · h-index: 22
Originality: Synthesis-oriented
AI Analysis

This paper addresses the evaluation of logical reasoning biases in AI, a question relevant to both cognitive science and AI safety, but its contribution is incremental: it applies existing methods to a new dataset.

The paper asks whether large language models exhibit human-like biases in syllogistic reasoning and finds that models struggle more with problems involving belief biases, conversion errors, and atmosphere effects.

This paper investigates whether current large language models exhibit biases in logical reasoning, similar to humans. Specifically, we focus on syllogistic reasoning, a well-studied form of inference in the cognitive science of human deduction. To facilitate our analysis, we introduce a dataset called NeuBAROCO, originally designed for psychological experiments that assess human logical abilities in syllogistic reasoning. The dataset consists of syllogistic inferences in both English and Japanese. We examine three types of biases observed in human syllogistic reasoning: belief biases, conversion errors, and atmosphere effects. Our findings demonstrate that current large language models struggle more with problems involving these three types of biases.

