AI · CL · Jun 23, 2025

AggTruth: Contextual Hallucination Detection using Aggregated Attention Scores in LLMs

arXiv:2506.18628v1 · 3 citations · h-index: 4 · ICCS
Originality: Incremental advance
AI Analysis

The paper addresses hallucination in LLMs deployed in real-world settings, particularly Retrieval-Augmented Generation. It appears incremental, because it builds on existing attention-based detection methods.

The paper tackles contextual hallucinations in Large Language Models (LLMs) by introducing AggTruth, a method that detects them from aggregated internal attention scores. It reports outperforming the current state of the art in multiple scenarios, with stable performance across LLMs.

In real-world applications, Large Language Models (LLMs) often hallucinate, even in Retrieval-Augmented Generation (RAG) settings, which poses a significant challenge to their deployment. In this paper, we introduce AggTruth, a method for online detection of contextual hallucinations by analyzing the distribution of internal attention scores in the provided context (passage). Specifically, we propose four different variants of the method, each varying in the aggregation technique used to calculate attention scores. Across all LLMs examined, AggTruth demonstrated stable performance in both same-task and cross-task setups, outperforming the current SOTA in multiple scenarios. Furthermore, we conducted an in-depth analysis of feature selection techniques and examined how the number of selected attention heads impacts detection performance, demonstrating that careful selection of heads is essential to achieve optimal results.
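To make the idea of aggregating attention over the provided context concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not name the four aggregation variants, so the two shown here (total attention mass on the passage, and entropy of the attention distribution within the passage) are illustrative assumptions, as are the function name `aggregate_context_attention` and the input layout (one attention row per head, for a single generated token, with passage tokens first).

```python
import math


def aggregate_context_attention(head_rows, context_len, how="sum"):
    """Return one aggregated feature per attention head.

    head_rows   -- hypothetical layout: one list of attention weights per head,
                   covering all positions the generated token attends to.
    context_len -- number of leading positions that belong to the passage.
    how         -- "sum" or "entropy"; both are illustrative choices, not
                   necessarily the variants used in the paper.
    """
    features = []
    for row in head_rows:
        ctx = row[:context_len]          # weights that fall on the passage
        mass = sum(ctx)                  # total attention paid to the passage
        if how == "sum":
            features.append(mass)
        elif how == "entropy":
            if mass == 0:
                features.append(0.0)     # no attention on the passage at all
                continue
            # Renormalize so the passage weights form a distribution.
            probs = [w / mass for w in ctx]
            features.append(-sum(p * math.log(p) for p in probs if p > 0))
        else:
            raise ValueError(f"unknown aggregation: {how!r}")
    return features


# Toy input: 2 heads, 4 positions, the first 2 positions are the passage.
toy_heads = [
    [0.4, 0.4, 0.1, 0.1],    # head that mostly looks at the passage
    [0.05, 0.05, 0.6, 0.3],  # head that mostly ignores the passage
]
sum_features = aggregate_context_attention(toy_heads, context_len=2, how="sum")
entropy_features = aggregate_context_attention(toy_heads, context_len=2, how="entropy")
```

In a detector along these lines, the per-head features for each generated token would be passed to a separate classifier, and the head-selection analysis described in the abstract would correspond to keeping only a subset of these features. Both steps are outside this sketch.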

