CL | Oct 9, 2025

If Probable, Then Acceptable? Understanding Conditional Acceptability Judgments in Large Language Models

Jasmin Orth, Philipp Mondorf, Barbara Plank

arXiv:2510.08388v1 | 2.7 | h-index: 5

Originality: Incremental advance

AI Analysis

This work addresses how LLMs reason about conditional statements, a question relevant to researchers in AI and cognitive science. It is an incremental advance because it builds on prior work on conditional inferences.

The study investigated how large language models (LLMs) judge the acceptability of conditional statements, finding that models are sensitive to conditional probability and semantic relevance but less consistently than humans, with larger models not necessarily aligning more closely with human judgments.

Conditional acceptability refers to how plausible a conditional statement is perceived to be. It plays an important role in communication and reasoning, as it influences how individuals interpret implications, assess arguments, and make decisions based on hypothetical scenarios. When humans evaluate how acceptable a conditional "If A, then B" is, their judgments are influenced by two main factors: the $\textit{conditional probability}$ of $B$ given $A$, and the $\textit{semantic relevance}$ of the antecedent $A$ given the consequent $B$ (i.e., whether $A$ meaningfully supports $B$). While prior work has examined how large language models (LLMs) draw inferences about conditional statements, it remains unclear how these models judge the $\textit{acceptability}$ of such statements. To address this gap, we present a comprehensive study of LLMs' conditional acceptability judgments across different model families, sizes, and prompting strategies. Using linear mixed-effects models and ANOVA tests, we find that models are sensitive to both conditional probability and semantic relevance, though to varying degrees depending on architecture and prompting style. A comparison with human data reveals that while LLMs incorporate probabilistic and semantic cues, they do so less consistently than humans. Notably, larger models do not necessarily align more closely with human judgments.
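To make the two factors concrete, the following is a minimal Python sketch, not the authors' method. It computes the conditional probability $P(B \mid A)$ from a joint distribution over $A$ and $B$. As a stand-in for semantic relevance it uses the difference $\Delta P = P(B \mid A) - P(B \mid \neg A)$, which is one operationalization found in the psychology literature; the abstract does not say which measure the paper uses, so treat it as an assumption. The class name, function names, and numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JointDistribution:
    """Joint probabilities over antecedent A and consequent B (hypothetical container)."""
    p_a_b: float        # P(A and B)
    p_a_notb: float     # P(A and not B)
    p_nota_b: float     # P(not A and B)
    p_nota_notb: float  # P(not A and not B)

    def __post_init__(self) -> None:
        total = self.p_a_b + self.p_a_notb + self.p_nota_b + self.p_nota_notb
        if abs(total - 1.0) > 1e-9:
            raise ValueError(f"joint probabilities must sum to 1, got {total}")


def conditional_probability(d: JointDistribution) -> float:
    """P(B | A) = P(A and B) / P(A)."""
    p_a = d.p_a_b + d.p_a_notb
    if p_a == 0:
        raise ValueError("P(A) is zero, so P(B | A) is undefined")
    return d.p_a_b / p_a


def delta_p(d: JointDistribution) -> float:
    """Assumed relevance measure: P(B | A) - P(B | not A).

    Positive values mean A raises the probability of B, zero means A is
    irrelevant to B, and negative values mean A lowers it.
    """
    p_nota = d.p_nota_b + d.p_nota_notb
    if p_nota == 0:
        raise ValueError("P(not A) is zero, so P(B | not A) is undefined")
    return conditional_probability(d) - d.p_nota_b / p_nota


# Two illustrative (made-up) scenarios with the same P(B | A):
relevant = JointDistribution(0.4, 0.1, 0.1, 0.4)    # A supports B
irrelevant = JointDistribution(0.4, 0.1, 0.4, 0.1)  # B is likely regardless of A

for name, dist in [("relevant", relevant), ("irrelevant", irrelevant)]:
    print(name, round(conditional_probability(dist), 3), round(delta_p(dist), 3))
```

Both scenarios have a conditional probability of about 0.8, but only the first has positive relevance (about 0.6 under this measure, versus 0 for the second). This shows how the two factors can come apart, which is why it makes sense to study them separately.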
