CL AI · May 27, 2025

VeriTrail: Closed-Domain Hallucination Detection with Traceability

arXiv:2505.21786v16.73 citations · h-index: 1

Originality: Incremental advance

AI Analysis

The paper addresses unsubstantiated content generation in AI systems, particularly in applications that require reliability. The advance is incremental because it builds on existing hallucination detection efforts.

The paper tackles the problem of closed-domain hallucination in language models, especially in multi-step generative processes, by introducing VeriTrail, a detection method that provides traceability and outperforms baselines on new datasets.

Even when instructed to adhere to source material, Language Models often generate unsubstantiated content - a phenomenon known as "closed-domain hallucination." This risk is amplified in processes with multiple generative steps (MGS), compared to processes with a single generative step (SGS). However, due to the greater complexity of MGS processes, we argue that detecting hallucinations in their final outputs is necessary but not sufficient: it is equally important to trace where hallucinated content was likely introduced and how faithful content may have been derived from the source through intermediate outputs. To address this need, we present VeriTrail, the first closed-domain hallucination detection method designed to provide traceability for both MGS and SGS processes. We also introduce the first datasets to include all intermediate outputs as well as human annotations of final outputs' faithfulness for their respective MGS processes. We demonstrate that VeriTrail outperforms baseline methods on both datasets.
