LG May 7

LLMs are not (consistently) Bayesian: Quantifying internal (in)consistencies of LLMs' probabilistic beliefs

Chacha Chen, Matthew Jörke, Adam Goliński, Masha Fedzechkina, Guillermo Sapiro, Sinead Williamson, Nicholas Foti

arXiv:2605.06915

AI Analysis

For researchers and practitioners deploying LLMs in high-stakes domains, this work reveals fundamental inconsistencies in how LLMs represent and update uncertainty, challenging assumptions about their probabilistic reasoning.

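The paper investigates whether LLMs update their probabilistic beliefs in a Bayesian manner. Some approaches to incorporating evidence yield nearly Bayesian updates, while others appear to rely on learned heuristics. Those heuristic updates often beat exact Bayesian computation on downstream task performance, which the authors take as a sign that the LLMs' probabilistic models of the world are misspecified.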

Modern AI systems are being deployed in complex domains such as medicine, science, and law, where it is important that they not only produce correct answers, but also represent and update uncertain beliefs about the world as new evidence arrives. We introduce the novel technique of studying LLMs as information processing rules and utilize the information processing gap to study the internal (in)consistencies of how LLMs update their probabilistic beliefs from evidence. Our extensive experiments evaluate multiple approaches in which LLMs can incorporate evidence into their beliefs. Some of these approaches produce (nearly) Bayesian updates; others seem to use a learned heuristic. Surprisingly, the non-Bayesian heuristic updates often outperform exact Bayesian computation in terms of downstream task performance -- indicating the LLMs' probabilistic models of the world are misspecified. Lastly, we show how our measure can provide diagnostics to identify issues with LLM-powered inferential systems.
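To make the notion of a "Bayesian update" concrete, here is a minimal sketch of one way to check whether a stated posterior agrees with the posterior implied by a stated prior and likelihood. It is not the paper's information processing gap; the KL-based discrepancy, the function names, and all numbers are assumptions chosen for illustration only.

```python
import math

def bayes_posterior(prior, likelihood):
    """Posterior implied by Bayes' rule for a discrete hypothesis set."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(joint)
    if evidence <= 0:
        raise ValueError("evidence has zero probability under the prior")
    return [j / evidence for j in joint]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical elicited beliefs over two hypotheses (illustrative numbers,
# not taken from the paper).
prior = [0.5, 0.5]             # belief before seeing the evidence
likelihood = [0.8, 0.2]        # P(evidence | hypothesis)
stated_posterior = [0.7, 0.3]  # belief reported after seeing the evidence

implied_posterior = bayes_posterior(prior, likelihood)
gap = kl_divergence(stated_posterior, implied_posterior)

print("implied posterior:", implied_posterior)
print("discrepancy (nats):", gap)
```

A discrepancy of zero means the stated posterior matches the Bayes update exactly; larger values indicate a stated belief that the prior and likelihood do not jointly support.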
