CL, AI · Jan 30, 2024

Incoherent Probability Judgments in Large Language Models

arXiv:2401.16646v2 · 11.523 citations · h-index: 7 · CogSci

Originality: Incremental advance

AI Analysis

Incoherent probability judgments reveal a key limitation in LLMs' reasoning abilities, which matters for researchers and developers who rely on these models for probabilistic tasks.

The study investigated whether large language models (LLMs) produce coherent probability judgments. It found that their judgments are often incoherent, showing human-like systematic deviations from probability theory and, across repeated judgments of the same event, an inverted-U-shaped mean-variance relationship.

Autoregressive Large Language Models (LLMs) trained for next-word prediction have demonstrated remarkable proficiency at producing coherent text. But are they equally adept at forming coherent probability judgments? We use probabilistic identities and repeated judgments to assess the coherence of probability judgments made by LLMs. Our results show that the judgments produced by these models are often incoherent, displaying human-like systematic deviations from the rules of probability theory. Moreover, when prompted to judge the same event repeatedly, the mean-variance relationship of probability judgments produced by LLMs shows an inverted-U shape like that seen in humans. We propose that these deviations from rationality can be explained by linking autoregressive LLMs to implicit Bayesian inference and drawing parallels with the Bayesian Sampler model of human probability judgments.
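The two diagnostics named in the abstract, probabilistic identities and repeated judgments, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the two identities are standard results of probability theory and not necessarily the set the authors tested, and `bayesian_sampler_moments` uses one common formulation of the Bayesian Sampler (a Binomial sample of size `n_samples` combined with a symmetric Beta prior with parameter `beta`), which may differ in detail from the paper's treatment.

```python
from statistics import mean, pvariance


def complement_deviation(p_a, p_not_a):
    """P(A) + P(not A) - 1; equals 0 for coherent judgments."""
    return p_a + p_not_a - 1.0


def addition_deviation(p_a, p_b, p_a_and_b, p_a_or_b):
    """P(A) + P(B) - P(A and B) - P(A or B); equals 0 for coherent judgments."""
    return p_a + p_b - p_a_and_b - p_a_or_b


def mean_and_variance(judgments):
    """Mean and population variance of repeated judgments of one event."""
    return mean(judgments), pvariance(judgments)


def bayesian_sampler_moments(p, n_samples, beta):
    """Mean and variance of the estimate (S + beta) / (n_samples + 2 * beta),
    where S ~ Binomial(n_samples, p). Assumed formulation, see lead-in."""
    denom = n_samples + 2 * beta
    est_mean = (n_samples * p + beta) / denom
    est_var = n_samples * p * (1 - p) / denom ** 2
    return est_mean, est_var


if __name__ == "__main__":
    # Made-up judgments (not data from the paper) for a single pair of events.
    print(complement_deviation(0.7, 0.4))
    print(addition_deviation(0.6, 0.5, 0.3, 0.7))
    print(mean_and_variance([0.55, 0.60, 0.70, 0.50]))
    # Under the assumed model, variance peaks at p = 0.5 and falls toward 0 and 1.
    for p in (0.1, 0.5, 0.9):
        print(p, bayesian_sampler_moments(p, n_samples=10, beta=1.0))
```

A nonzero deviation signals incoherence on that identity. Because the assumed model's variance is proportional to p(1 - p), plotting it against the mean estimate gives the inverted-U shape the abstract describes.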
