CL AI Apr 14, 2025

Hallucination Detection in LLMs with Topological Divergence on Attention Graphs

Alexandra Bazarova, Aleksandr Yugay, Andrey Shulga, Alina Ermilova, Andrei Volodichev, Konstantin Polev, Julia Belikova, Rauf Parchiev, Dmitry Simakov, Maxim Savchenko, Andrey Savchenko, Serguei Barannikov

arXiv:2504.10063v317.013 citations, h-index: 4

Originality: Incremental advance

AI Analysis

This work addresses the critical challenge of factual reliability in LLMs for users in retrieval-augmented generation (RAG) settings, though the advance is incremental because it builds on existing attention-based methods.

The paper tackles hallucination detection in large language models by introducing TOHA, a topology-based detector that applies a divergence metric to attention graphs. It achieves state-of-the-art or competitive results on question answering and summarization benchmarks while requiring minimal annotated data and computational resources.

Hallucination, i.e., generating factually incorrect content, remains a critical challenge for large language models (LLMs). We introduce TOHA, a TOpology-based HAllucination detector in the RAG setting, which leverages a topological divergence metric to quantify the structural properties of graphs induced by attention matrices. Examining the topological divergence between prompt and response subgraphs reveals a consistent pattern: higher divergence values in specific attention heads correlate with hallucinated outputs, independent of the dataset. Extensive experiments, including evaluation on question answering and summarization tasks, show that our approach achieves state-of-the-art or competitive results on several benchmarks while requiring minimal annotated data and computational resources. Our findings suggest that analyzing the topological structure of attention matrices can serve as an efficient and robust indicator of factual reliability in LLMs.
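To make the idea of a divergence between prompt and response subgraphs more concrete, here is a minimal sketch for a single attention head. It is an illustration under stated assumptions, not the paper's exact metric: the abstract does not define the divergence, so the sketch assumes edge distances of 1 minus symmetrized attention and scores a response by the total weight of the minimum spanning forest that attaches every response token to the prompt, with the prompt collapsed into one node. The function name `prompt_response_divergence` and the parameter `n_prompt` are hypothetical.

```python
import numpy as np


def prompt_response_divergence(attn, n_prompt):
    """Illustrative score: cost of attaching response tokens to the prompt.

    attn     : (n, n) attention matrix of one head, rows = query tokens.
    n_prompt : number of leading tokens that belong to the prompt.

    Assumption (not taken from the paper): distance = 1 - symmetrized
    attention, and the score is the weight of a minimum spanning tree
    over {collapsed prompt} + response tokens.
    """
    attn = np.asarray(attn, dtype=float)
    n = attn.shape[0]
    if attn.ndim != 2 or attn.shape != (n, n):
        raise ValueError("attn must be a square matrix")
    if not 0 < n_prompt < n:
        raise ValueError("n_prompt must leave at least one response token")

    # Causal attention is lower triangular, so symmetrize with a max.
    dist = 1.0 - np.maximum(attn, attn.T)

    # Distance from each response token to the collapsed prompt node.
    to_prompt = dist[n_prompt:, :n_prompt].min(axis=1)
    # Distances among response tokens.
    resp = dist[n_prompt:, n_prompt:]

    # Prim's algorithm, started from the collapsed prompt node.
    m = n - n_prompt
    best = to_prompt.copy()          # cheapest known edge into the tree
    in_tree = np.zeros(m, dtype=bool)
    total = 0.0
    for _ in range(m):
        candidates = np.where(in_tree, np.inf, best)
        j = int(np.argmin(candidates))
        total += candidates[j]
        in_tree[j] = True
        best = np.minimum(best, resp[j])
    return total


# Toy example: token 0 is the prompt, tokens 1 and 2 are the response.
grounded = [[1.0, 0.0, 0.0],
            [0.9, 0.1, 0.0],
            [0.1, 0.8, 0.1]]
drifting = [[1.0, 0.0, 0.0],
            [0.1, 0.9, 0.0],
            [0.05, 0.15, 0.8]]

score_grounded = prompt_response_divergence(grounded, n_prompt=1)
score_drifting = prompt_response_divergence(drifting, n_prompt=1)
```

In this toy setup the response that attends strongly to the prompt gets a lower score than the one that mostly attends to itself, which matches the direction described in the abstract (higher divergence correlating with hallucination). Which heads to use and how to threshold the score are left open here.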
