CL · AI · Jul 23, 2025

Are LLM Belief Updates Consistent with Bayes' Theorem?

arXiv:2507.17951v1 · 5 citations · h-index: 2
Originality: Incremental advance
AI Analysis

The paper addresses the problem of understanding and governing the consistency of LLM reasoning, but the advance is incremental because it builds on existing model evaluation methods.

The study tests whether larger language models update their beliefs more consistently with Bayes' theorem. It finds that larger models show higher Bayesian coherence, as measured by a new metric, the Bayesian Coherence Coefficient (BCC).

Do larger and more capable language models learn to update their "beliefs" about propositions more consistently with Bayes' theorem when presented with evidence in-context? To test this, we formulate a Bayesian Coherence Coefficient (BCC) metric and generate a dataset with which to measure the BCC. We measure BCC for multiple pre-trained-only language models across five model families, comparing against the number of model parameters, the amount of training data, and model scores on common benchmarks. Our results provide evidence for our hypothesis that larger and more capable pre-trained language models assign credences that are more coherent with Bayes' theorem. These results have important implications for our understanding and governance of LLMs.
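The abstract does not spell out how the BCC is computed, so the following is only an illustrative sketch of the general idea, not the paper's metric. It assumes each test case supplies a prior credence, the two likelihoods P(E|H) and P(E|not H), and the credence a model reports after seeing the evidence. The score is the Pearson correlation between the log-odds update that Bayes' theorem prescribes and the update the model actually made. The names `bayes_posterior` and `coherence_score` are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def bayes_posterior(prior, lik_h, lik_not_h):
    """P(H|E) from P(H), P(E|H) and P(E|not H) via Bayes' theorem."""
    evidence = lik_h * prior + lik_not_h * (1.0 - prior)
    return lik_h * prior / evidence


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def coherence_score(records):
    """Correlate Bayes-prescribed and observed log-odds updates.

    Each record is (prior, lik_h, lik_not_h, reported_posterior).
    This is an assumed stand-in for a coherence metric, not the BCC itself.
    """
    expected, observed = [], []
    for prior, lik_h, lik_not_h, reported in records:
        ideal = bayes_posterior(prior, lik_h, lik_not_h)
        expected.append(logit(ideal) - logit(prior))
        observed.append(logit(reported) - logit(prior))
    return pearson(expected, observed)


# Hypothetical cases: (prior, P(E|H), P(E|not H)).
cases = [(0.5, 0.8, 0.2), (0.3, 0.6, 0.3), (0.7, 0.1, 0.5)]

# A perfectly Bayesian reporter scores close to +1.
ideal_records = [(p, a, b, bayes_posterior(p, a, b)) for p, a, b in cases]
ideal_score = coherence_score(ideal_records)

# A reporter that updates in the opposite direction scores close to -1.
flipped_records = [(p, a, b, bayes_posterior(p, b, a)) for p, a, b in cases]
flipped_score = coherence_score(flipped_records)
```

Working in log-odds makes Bayes' theorem additive: the prescribed update equals the log likelihood ratio, so updates from different priors can be compared on one scale. In practice the reported credences would have to be elicited from a model, which this sketch leaves out.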

