CL · Mar 6, 2025

HalluCounter: Reference-free LLM Hallucination Detection in the Wild!

arXiv:2503.04615v2 · 2 citations · h-index: 4 · IJCNLP-AACL
Originality: Incremental advance
AI Analysis

The paper addresses hallucination detection for users of closed-source LLMs, offering a reference-free method with improved accuracy. The advance is incremental, since it builds on existing consistency-based approaches.

The paper tackles the problem of detecting hallucinations in large language models without relying on reference data or internal model states. It proposes HalluCounter, which uses response-response and query-response consistency patterns and reports over 90% average confidence in detection across datasets.

Response consistency-based, reference-free hallucination detection (RFHD) methods do not depend on internal model states, such as generation probabilities or gradients, which grey-box models typically rely on but are inaccessible in closed-source LLMs. However, their inability to capture query-response alignment patterns often results in lower detection accuracy. Additionally, the lack of large-scale benchmark datasets spanning diverse domains remains a challenge, as most existing datasets are limited in size and scope. To this end, we propose HalluCounter, a novel reference-free hallucination detection method that utilizes both response-response and query-response consistency and alignment patterns. This enables the training of a classifier that detects hallucinations and provides a confidence score and an optimal response for user queries. Furthermore, we introduce HalluCounterEval, a benchmark dataset comprising both synthetically generated and human-curated samples across multiple domains. Our method outperforms state-of-the-art approaches by a significant margin, achieving over 90% average confidence in hallucination detection across datasets.
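
To make the two signal types in the abstract concrete, here is a minimal Python sketch of how response-response and query-response scores could be turned into features for a downstream classifier. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say how consistency is scored, so a simple token-overlap (Jaccard) measure stands in for whatever scorer the authors use, and the names token_jaccard and consistency_features are hypothetical.

```python
from itertools import combinations
from statistics import mean


def token_jaccard(a: str, b: str) -> float:
    """Stand-in similarity (an assumption): Jaccard overlap of lowercase word sets."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0  # two empty texts are treated as identical
    return len(ta & tb) / len(ta | tb)


def consistency_features(query: str, responses: list[str]) -> dict[str, float]:
    """Summarise agreement among sampled responses and between query and responses.

    `responses` are several answers sampled from the same LLM for `query`.
    The returned dictionary is the kind of feature vector a classifier
    could be trained on; the exact features here are illustrative.
    """
    if len(responses) < 2:
        raise ValueError("need at least two sampled responses")
    # Response-response consistency: every unordered pair of responses.
    rr = [token_jaccard(a, b) for a, b in combinations(responses, 2)]
    # Query-response alignment: each response scored against the query.
    qr = [token_jaccard(query, r) for r in responses]
    return {
        "rr_mean": mean(rr),
        "rr_min": min(rr),
        "qr_mean": mean(qr),
        "qr_min": min(qr),
    }


if __name__ == "__main__":
    feats = consistency_features(
        "What is the capital of France?",
        [
            "Paris is the capital of France",
            "The capital of France is Paris",
            "Lyon",
        ],
    )
    print(feats)
```

Low response-response scores suggest the model's samples disagree with each other, while low query-response scores suggest an answer has drifted from the question. Token overlap is far too crude for real use; it only keeps the sketch self-contained.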
