LG, Jun 1, 2025

LLM Cannot Discover Causality, and Should Be Restricted to Non-Decisional Support in Causal Discovery

arXiv:2506.00844v1, 6 citations, h-index: 9
Originality: Incremental advance
AI Analysis

This paper addresses the problem of unreliable LLM-based causal discovery for researchers and practitioners, and represents an incremental shift in methodology.

The paper argues that LLMs lack theoretical grounding for causal reasoning and should not be used to determine causal relationships, but can assist as non-decisional heuristics to accelerate convergence and outperform existing methods in causal structure learning.

This paper critically re-evaluates LLMs' role in causal discovery and argues against their direct involvement in determining causal relationships. We demonstrate that LLMs' autoregressive, correlation-driven modeling inherently lacks the theoretical grounding for causal reasoning and introduces unreliability when used as priors in causal discovery algorithms. Through empirical studies, we expose the limitations of existing LLM-based methods and reveal that deliberate prompt engineering (e.g., injecting ground-truth knowledge) could overstate their performance, helping to explain the consistently favorable results reported in much of the current literature. Based on these findings, we strictly confine LLMs' role to a non-decisional auxiliary capacity: LLMs should not participate in determining the existence or directionality of causal relationships, but can assist the search process for causal graphs (e.g., LLM-based heuristic search). Experiments across various settings confirm that, by strictly isolating LLMs from causal decision-making, LLM-guided heuristic search can accelerate convergence and outperform both traditional and LLM-based methods in causal structure learning. We conclude with a call for the community to shift focus from naively applying LLMs to developing specialized models and training methods that respect the core principles of causal discovery.
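
The "non-decisional" role described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm, which the abstract does not specify. It assumes a greedy, first-improvement edge-addition search in which a list of suggested edges (standing in for LLM output) only reorders the candidates, while a data-driven score alone decides whether an edge is kept. The names `guided_greedy_search`, `toy_score`, and `TRUE_EDGES` are hypothetical, and `toy_score` is a stand-in for a real score such as BIC computed from data.

```python
from itertools import permutations
from typing import Callable, FrozenSet, Iterable, List, Tuple

Edge = Tuple[str, str]
Graph = FrozenSet[Edge]


def creates_cycle(graph: Graph, edge: Edge) -> bool:
    """Return True if adding edge (u, v) closes a cycle, i.e. v already reaches u."""
    u, v = edge
    stack, seen = [v], set()
    while stack:
        node = stack.pop()
        if node == u:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(b for a, b in graph if a == node)
    return False


def guided_greedy_search(
    nodes: List[str],
    score: Callable[[Graph], float],
    suggested_edges: Iterable[Edge] = (),
) -> Tuple[Graph, int]:
    """Greedy edge addition; returns the graph and the number of score evaluations.

    suggested_edges (e.g. from an LLM) only move candidates to the front of the
    scan order. An edge is accepted only if the score strictly improves.
    """
    candidates = list(permutations(nodes, 2))
    hinted = [e for e in dict.fromkeys(suggested_edges) if e in candidates]
    order = hinted + [e for e in candidates if e not in hinted]

    graph: Graph = frozenset()
    best = score(graph)
    evaluations = 0
    improved = True
    while improved:
        improved = False
        for edge in order:
            if edge in graph or creates_cycle(graph, edge):
                continue
            evaluations += 1
            trial = graph | {edge}
            trial_score = score(trial)
            if trial_score > best:  # the score, not the hint, makes the decision
                graph, best, improved = trial, trial_score, True
                break  # first improvement: rescan from the top of the order
    return graph, evaluations


# Toy stand-in for a data-driven score: +1 per correct edge, -1 per wrong edge.
TRUE_EDGES = frozenset({("A", "B"), ("B", "C")})


def toy_score(graph: Graph) -> float:
    return sum(1 if e in TRUE_EDGES else -1 for e in graph)


nodes = ["A", "B", "C"]
plain_graph, plain_evals = guided_greedy_search(nodes, toy_score)
good_graph, good_evals = guided_greedy_search(
    nodes, toy_score, suggested_edges=[("A", "B"), ("B", "C")]
)
bad_graph, bad_evals = guided_greedy_search(
    nodes, toy_score, suggested_edges=[("C", "A")]
)
```

In this toy setting all three runs recover the same graph: accurate suggestions reduce the number of score evaluations, and an inaccurate suggestion costs extra evaluations without being accepted. With a real score and a larger graph, the scan order can also change which local optimum a greedy search reaches, so the sketch shows only the separation of roles, not a performance guarantee.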

