AI · Feb 12, 2025

Ensemble based approach to quantifying uncertainty of LLM based classifications

Srijith Rajamohan, Ahmed Salhin, Josh Frazier, Rohit Kumar, Yu-Cheng Tsai, Todd Cook

arXiv:2502.08631v25.81 citations · h-index: 5

Originality: Incremental advance

AI Analysis

The paper addresses uncertainty estimation for users of LLMs in classification tasks, but it appears incremental because it builds on existing finetuning and ensemble methods.

The paper tackles the problem of quantifying uncertainty in LLM-based classifications. It proposes that, under greedy sampling, output variance reflects both the model's conceptual certainty and the lexical variance in the input, and it introduces a probabilistic method for estimating class certainties once finetuning has reduced the model's sensitivity to lexical variations.

The output of a Large Language Model (LLM) is a function of the model's internal parameters and the input provided in the context window. The hypothesis presented here is that, under a greedy sampling strategy, the variance in the LLM's output is a function of the conceptual certainty embedded in the model's parametric knowledge as well as the lexical variance in the input. Finetuning the model reduces the sensitivity of its output to lexical variations in the input. This is then applied to a classification problem, and a probabilistic method is proposed for estimating the certainties of the predicted classes.
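To make the idea concrete, here is a minimal sketch of one way an ensemble over lexical variants could yield class certainties. The abstract does not specify the paper's estimator, so everything here is an assumption for illustration: the same input is paraphrased several times, each paraphrase is classified under greedy decoding, and the empirical label frequencies are treated as certainties. The function names, the smoothing parameter `alpha`, and the example labels are hypothetical.

```python
from collections import Counter
import math


def estimate_class_certainty(predictions, classes, alpha=0.0):
    """Turn ensemble votes into a probability for each class.

    predictions: labels returned for lexical variants of one input
                 (assumed to come from greedy decoding).
    classes:     the full label set.
    alpha:       additive smoothing pseudo-count (illustrative assumption).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    unknown = set(counts) - set(classes)
    if unknown:
        raise ValueError(f"predictions outside the label set: {sorted(unknown)}")
    total = len(predictions) + alpha * len(classes)
    return {c: (counts[c] + alpha) / total for c in classes}


def normalized_entropy(probs):
    """Shannon entropy scaled to [0, 1]; 0 means full agreement."""
    k = len(probs)
    if k < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs.values() if p > 0)
    return h / math.log(k)


# Hypothetical labels for five paraphrases of the same input.
votes = ["billing", "billing", "billing", "technical", "billing"]
certainty = estimate_class_certainty(votes, ["billing", "technical", "other"])
predicted = max(certainty, key=certainty.get)
uncertainty = normalized_entropy(certainty)
```

In this sketch `predicted` is the majority label and `uncertainty` summarizes how much the variants disagree. If the hypothesis in the abstract holds, a finetuned model should show lower disagreement across paraphrases than the base model for the same inputs.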
