AI · Feb 22, 2025

Patterns Over Principles: The Fragility of Inductive Reasoning in LLMs under Noisy Observations

arXiv:2502.16169v2 · 18 citations · h-index: 17 · Has Code · ACL
Originality: Incremental advance
AI Analysis

The paper addresses reasoning robustness in LLMs, a relevant concern for AI development, but the advance is incremental because it builds on existing evaluation tasks.

The paper examines how fragile LLMs' inductive reasoning is under noisy observations. It finds that models become unstable under noise even when accuracy barely moves (e.g., 0% accuracy change but only a 70% consistency score) and that they rely on memorized patterns rather than genuine abstraction. It also proposes Sample-steered Rule Refinement (SRR), a method that outperforms other methods with minimal degradation under noise.

Inductive reasoning, a cornerstone of human cognition, enables generalization from limited data but hasn't yet been fully achieved by large language models (LLMs). While modern LLMs excel at reasoning tasks, their ability to maintain stable and consistent rule abstraction under imperfect observations remains underexplored. To fill this gap, in this work, we introduce Robust Rule Induction, a task that evaluates LLMs' capability in inferring rules from data that are fused with noisy examples. To address this task, we further propose Sample-steered Rule Refinement (SRR), a method enhancing reasoning stability via observation diversification and execution-guided feedback. Experiments across arithmetic, cryptography, and list functions reveal: (1) SRR outperforms other methods with minimal performance degradation under noise; (2) Despite slight accuracy variation, LLMs exhibit instability under noise (e.g., 0% accuracy change with only 70% consistent score); (3) Counterfactual task gaps highlight LLMs' reliance on memorized patterns over genuine abstraction. Our findings challenge LLMs' reasoning robustness, revealing susceptibility to hypothesis drift and pattern overfitting, while providing empirical evidence critical for developing human-like inductive systems. Code and data are available at https://github.com/HKUST-KnowComp/Robust-Rule-Induction.
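To make the idea of execution-guided feedback on noisy observations concrete, here is a minimal Python sketch. It is not the authors' SRR implementation: the names `fit_score` and `select_rule`, the integer input-output format, and the hand-written candidate rules are all assumptions made for illustration. In the paper's setting an LLM would propose the candidate rules; here they are hard-coded so the example runs on its own.

```python
from typing import Callable, Dict, List, Tuple

Observation = Tuple[int, int]  # hypothetical format: (input, output) pair


def fit_score(rule: Callable[[int], int], observations: List[Observation]) -> float:
    """Return the fraction of observations the rule reproduces when executed."""
    if not observations:
        return 0.0
    hits = 0
    for x, y in observations:
        try:
            if rule(x) == y:
                hits += 1
        except Exception:
            # A rule that crashes on an input simply fails that observation.
            pass
    return hits / len(observations)


def select_rule(
    candidates: Dict[str, Callable[[int], int]],
    observations: List[Observation],
) -> Tuple[str, float]:
    """Return the (name, score) of the candidate that fits the most observations."""
    scored = [(name, fit_score(rule, observations)) for name, rule in candidates.items()]
    return max(scored, key=lambda item: item[1])


# Assumed hidden rule: y = 3x + 1. The pair (4, 20) is an injected noisy example.
observations = [(1, 4), (2, 7), (3, 10), (4, 20), (5, 16)]

# Hand-written stand-ins for rules a model might propose.
candidates = {
    "3x + 1": lambda x: 3 * x + 1,
    "x + 3": lambda x: x + 3,
    "5x": lambda x: 5 * x,
}

best_name, best_score = select_rule(candidates, observations)
print(best_name, best_score)
```

Scoring by the fraction of observations matched, rather than requiring a perfect fit, is one simple way to keep a single noisy example from eliminating the correct rule. Whether SRR scores candidates this way is not stated in the abstract.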

Code Implementations: 1 repo
