CL · Jul 1, 2024

LLM Uncertainty Quantification through Directional Entailment Graph and Claim Level Response Augmentation

arXiv:2407.00994v2 · 28 citations · h-index: 9
Originality: Incremental advance
AI Analysis

This work addresses the trustworthiness problem for users who rely on LLMs as decision assistants or explainers. It is an incremental improvement over existing uncertainty quantification methods.

This paper quantifies uncertainty in large language model responses by constructing a directed graph from entailment probabilities and applying the random-walk Laplacian to derive uncertainty metrics. It also proposes an augmentation method to address vagueness in responses. The authors report extensive experiments demonstrating the superiority of their proposed solutions.

Large language models (LLMs) have shown superior capabilities in sophisticated tasks across various domains. Beyond basic question answering (QA), they are now used as decision assistants or explainers for unfamiliar content. However, they are not always correct, owing to data sparsity in domain-specific corpora or to the model's hallucination problems. Given this, how much should we trust the responses from LLMs? This paper presents a novel way to evaluate uncertainty that captures directional instability: we construct a directed graph from entailment probabilities, apply the random-walk Laplacian to account for the asymmetry of the constructed graph, and aggregate the uncertainty from the eigenvalues of that Laplacian. We also provide a way to incorporate the semantic uncertainty of existing work into our proposed layer. In addition, this paper identifies vagueness issues in the raw response set and proposes an augmentation approach to mitigate them. We conducted extensive empirical experiments and demonstrated the superiority of our proposed solutions.
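The graph-based pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy entailment matrices, and the aggregation rule (summing the clipped values of 1 minus the real part of each eigenvalue, borrowed from earlier symmetric-graph approaches) are all assumptions made for illustration. The input W is assumed to hold pairwise entailment probabilities between sampled responses, with W[i, j] the probability that response i entails response j, so W is generally asymmetric.

```python
import numpy as np


def random_walk_laplacian(W):
    """Return L = I - D^{-1} W for a directed, weighted adjacency matrix W.

    D is the diagonal out-degree matrix (row sums of W). Rows that sum to
    zero are left as zero rows in D^{-1} W to avoid division by zero.
    """
    W = np.asarray(W, dtype=float)
    out_degree = W.sum(axis=1)
    safe_degree = np.where(out_degree > 0, out_degree, 1.0)
    transition = W / safe_degree[:, None]  # row-stochastic where degree > 0
    return np.eye(W.shape[0]) - transition


def eigen_uncertainty(W):
    """Aggregate Laplacian eigenvalues into one uncertainty score.

    Assumed aggregation (hypothetical, not taken from the paper):
    sum over eigenvalues of max(0, 1 - Re(lambda)). Because W is
    asymmetric, eigenvalues may be complex, so only the real part is used.
    """
    eigvals = np.linalg.eigvals(random_walk_laplacian(W))
    return float(np.sum(np.clip(1.0 - eigvals.real, 0.0, None)))


# Toy inputs (made up): three responses that all entail one another,
# three responses that entail only themselves, and an asymmetric pair.
consistent = np.ones((3, 3))
disjoint = np.eye(3)
asymmetric = np.array([[1.0, 0.9],
                       [0.1, 1.0]])

u_consistent = eigen_uncertainty(consistent)
u_disjoint = eigen_uncertainty(disjoint)
u_asymmetric = eigen_uncertainty(asymmetric)
```

Under this assumed aggregation, the score behaves like a soft count of semantic clusters: it is close to 1 when every response entails every other and close to the number of responses when no response entails another, with asymmetric entailment landing in between.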

