CL · Oct 23, 2023

Assessing Step-by-Step Reasoning against Lexical Negation: A Case Study on Syllogism

Mengyu Ye, Tatsuki Kuribayashi, Jun Suzuki, Goro Kobayashi, Hiroaki Funayama

arXiv:2310.14868v121.6136 citations · h-index: 15

Originality: Synthesis-oriented

AI Analysis

This work addresses the problem of assessing logical reasoning robustness in LLMs for researchers and developers, but it is incremental as it builds on existing chain-of-thought prompting methods.

The study evaluated the step-by-step reasoning ability of large language models (LLMs) with a focus on lexical negation, revealing that dozens of modern LLMs were not robust against negation in chain-of-thought reasoning, highlighting unique limitations in each model family.

Large language models (LLMs) take advantage of step-by-step reasoning instructions, e.g., chain-of-thought (CoT) prompting. Building on this, their ability to perform CoT-style reasoning robustly is of interest from a probing perspective. In this study, we inspect the step-by-step reasoning ability of LLMs with a focus on negation, which is a core linguistic phenomenon that is difficult to process. In particular, we introduce several controlled settings (e.g., reasoning in case of fictional entities) to evaluate the logical reasoning abilities of the models. We observed that dozens of modern LLMs were not robust against lexical negation (e.g., plausible -> implausible) when performing CoT-style reasoning, and the results highlight unique limitations in each LLM family.
