CL AI
May 10, 2025

Recovering Event Probabilities from Large Language Model Embeddings via Axiomatic Constraints

arXiv:2505.07883v1
1 citation
h-index: 7
Originality: Incremental advance
AI Analysis

The paper addresses unreliable LLM probability estimates for uncertain events, a problem that matters because rational decision-making depends on such estimates. The advance is incremental because it builds on existing VAE methods.

The paper tackles the problem of incoherent event probabilities generated by Large Language Models (LLMs) by recovering probabilities from LLM embeddings under axiomatic constraints. The recovered probabilities are more coherent than those directly reported by the models and align closely with the true probabilities.

Rational decision-making under uncertainty requires coherent degrees of belief in events. However, event probabilities generated by Large Language Models (LLMs) have been shown to exhibit incoherence, violating the axioms of probability theory. This raises the question of whether coherent event probabilities can be recovered from the embeddings used by the models. If so, those derived probabilities could be used as more accurate estimates for events involving uncertainty. To explore this question, we propose enforcing axiomatic constraints, such as the additive rule of probability theory, in the latent space learned by an extended variational autoencoder (VAE) applied to LLM embeddings. This approach enables event probabilities to emerge naturally in the latent space as the VAE learns both to reconstruct the original embeddings and to predict the embeddings of semantically related events. We evaluate our method on complementary events (i.e., event A and its complement, event not-A), where the true probabilities of the two events must sum to 1. Experimental results on open-weight language models demonstrate that probabilities recovered from embeddings exhibit greater coherence than those directly reported by the corresponding models and align closely with the true probabilities.
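
The additive constraint described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the architecture, the loss terms, or their weights. The sketch assumes that one latent unit per event is squashed to [0, 1] with a sigmoid and read as a probability, and that the constraint enters the loss as a squared penalty with a hypothetical weight `lam`. The names `additivity_penalty`, `total_loss`, and `incoherence` are illustrative only.

```python
import math


def sigmoid(x: float) -> float:
    """Map a real-valued latent activation to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def additivity_penalty(z_a: float, z_not_a: float) -> float:
    """Squared violation of P(A) + P(not-A) = 1.

    z_a and z_not_a are the (assumed) probability-carrying latent units
    for event A and its complement, before squashing.
    """
    p_a = sigmoid(z_a)
    p_not_a = sigmoid(z_not_a)
    return (p_a + p_not_a - 1.0) ** 2


def total_loss(recon_err: float, pred_err: float, kl: float,
               z_a: float, z_not_a: float, lam: float = 1.0) -> float:
    """Assumed overall objective: the usual VAE terms (reconstruction, KL),
    an error term for predicting the related event's embedding, and the
    axiomatic penalty weighted by lam (a hypothetical hyperparameter)."""
    return recon_err + pred_err + kl + lam * additivity_penalty(z_a, z_not_a)


def incoherence(p_a: float, p_not_a: float) -> float:
    """Absolute deviation from the additive rule for a complementary pair.
    Zero means the two probabilities are coherent."""
    return abs(p_a + p_not_a - 1.0)
```

Under these assumptions, a model that reports 0.7 for A and 0.5 for not-A has an incoherence of about 0.2, while latent activations of opposite sign and equal magnitude incur no penalty because their sigmoids sum to 1.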
