CL Feb 19, 2024

TrustScore: Reference-Free Evaluation of LLM Response Trustworthiness

arXiv:2402.12545v2, 18 citations, h-index: 86
AI Analysis

This paper addresses the difficulty non-experts have in identifying inaccuracies in LLM outputs, particularly in closed-book tasks. The contribution is incremental, as it builds on existing evaluation methods.

The paper tackles the evaluation of LLM response trustworthiness in closed-book QA. It introduces TrustScore, a framework based on Behavioral Consistency that assesses whether a response aligns with the model's intrinsic knowledge and that can be combined with fact-checking. TrustScore correlates strongly with human judgments, surpassing existing reference-free metrics and matching reference-based ones.

Large Language Models (LLMs) have demonstrated impressive capabilities across various domains, prompting a surge in their practical applications. However, concerns have arisen regarding the trustworthiness of LLM outputs, particularly in closed-book question-answering tasks, where non-experts may struggle to identify inaccuracies due to the absence of contextual or ground truth information. This paper introduces TrustScore, a framework based on the concept of Behavioral Consistency, which evaluates whether an LLM's response aligns with its intrinsic knowledge. Additionally, TrustScore can seamlessly integrate with fact-checking methods, which assess alignment with external knowledge sources. The experimental results show that TrustScore achieves strong correlations with human judgments, surpassing existing reference-free metrics, and achieving results on par with reference-based metrics.
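
The abstract does not spell out how Behavioral Consistency is measured. As a rough illustration only, one way to probe whether a response aligns with a model's intrinsic knowledge is to re-ask the question in multiple-choice form, mixing the model's original answer with distractors, and count how often the model re-selects its own answer. The function name `behavioral_consistency`, the `ask` callback, and the distractor list below are all hypothetical and are not taken from the paper or its repository.

```python
import random
from typing import Callable, Sequence


def behavioral_consistency(
    ask: Callable[[str, Sequence[str]], int],
    question: str,
    answer: str,
    distractors: Sequence[str],
    rounds: int = 5,
    seed: int = 0,
) -> float:
    """Return the fraction of rounds in which the model re-selects `answer`.

    `ask` is a hypothetical wrapper around an LLM call: it receives the
    question and a list of options and returns the index of the chosen option.
    This is an assumed scoring scheme, not the paper's published procedure.
    """
    if rounds <= 0:
        raise ValueError("rounds must be positive")
    rng = random.Random(seed)
    hits = 0
    for _ in range(rounds):
        # Shuffle so the model cannot succeed by always picking one position.
        options = [answer, *distractors]
        rng.shuffle(options)
        choice = ask(question, options)
        if options[choice] == answer:
            hits += 1
    return hits / rounds


# Stub "models" standing in for a real LLM call (assumptions for the demo).
def always_paris(question: str, options: Sequence[str]) -> int:
    return list(options).index("Paris")


def never_paris(question: str, options: Sequence[str]) -> int:
    return next(i for i, opt in enumerate(options) if opt != "Paris")


if __name__ == "__main__":
    q = "What is the capital of France?"
    print(behavioral_consistency(always_paris, q, "Paris", ["Lyon", "Nice"]))
    print(behavioral_consistency(never_paris, q, "Paris", ["Lyon", "Nice"]))
```

A score near 1 suggests the model stands by its answer under rephrasing, and a score near 0 suggests it does not. A score of this kind says nothing about factual correctness on its own, which is why the abstract pairs the consistency check with external fact-checking.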

Code Implementations: 1 repo
