AI · May 1

LLMs Should Not Yet Be Credited with Decision Explanation

arXiv:2605.01164 · 49.1

Predicted impact: top 70% in AI · last 90 days

Originality: Incremental advance

AI Analysis

For researchers using LLMs to model human decision-making, this paper warns against conflating prediction with explanation, since doing so could misdirect progress.

This position paper argues that LLMs should not be credited with decision explanation, as current evidence supports prediction and rationale generation but not true explanation. It proposes a bridge standard for crediting explanations and a principle of credit calibration to prevent overclaiming.

This position paper argues that LLMs should not yet be credited with decision explanation. This matters because recent work increasingly treats accurate behavioral prediction, plausible rationales, and outcome-conditioned reasoning traces as evidence that LLMs explain why people decide as they do, risking a premature redefinition of what counts as explanatory progress in human decision modeling. We first distinguish three claims with different evidential burdens: decision prediction, rationale generation, and decision explanation. We then argue that the evidence most commonly offered for LLM-based decision accounts directly supports the first two claims, and sometimes explanatory hypothesis generation, but does not distinguish decision explanation from prediction-supportive rationalization. Next, we propose a bridge standard for decision-explanation credit: stronger claims should specify explanatory targets, discriminate against weaker rationalizer alternatives, use target-appropriate process- or intervention-sensitive validation, and bound their scope. We then situate this standard against competing views and related literatures, clarifying why it preserves the value of LLMs as predictors, narrators, and hypothesis generators while resisting premature explanatory credit. We conclude with a principle of credit calibration: LLMs should be credited for the strongest claim their evidence warrants, and no stronger; if adopted, this principle can help turn LLMs from persuasive narrators of decisions into more reliable instruments for discovering, testing, and communicating explanations of human behavior.
