LG · AI · CL · May 14

When Answers Stray from Questions: Hallucination Detection via Question-Answer Orthogonal Decomposition

arXiv:2605.14449 · 56.0
AI Analysis

For LLM users and developers, QAOD provides an efficient and robust hallucination detection method that generalizes across domains, addressing the trade-off between accuracy and efficiency.

QAOD proposes a single-pass framework for LLM hallucination detection that projects away question-aligned directions from answer representations, achieving best in-domain AUROC and up to 21% improvement in OOD transfer on BioASQ at under 25% generation cost.

Hallucination detection in large language models (LLMs) requires balancing accuracy, efficiency, and robustness to distribution shift. Black-box consistency methods are effective but demand repeated inference; single-pass white-box probes are efficient yet treat answer representations in isolation, often degrading sharply under domain shift. We propose QAOD (Question-Answer Orthogonal Decomposition), a single-pass framework that projects away the question-aligned direction from the answer representation to obtain a question-orthogonal component that suppresses domain-conditioned variation. To identify informative signals, QAOD further selects layers via diversity-penalized Fisher scoring and discriminative neurons via Fisher importance. To address both in-domain detection and cross-domain generalization, we design two complementary probing strategies: pairing the orthogonal component with question context yields a joint probe that maximizes in-domain discriminability, while using the orthogonal component alone preserves domain-agnostic factuality signals for robust transfer. QAOD's joint probe achieves the best in-domain AUROC across all evaluated model-dataset pairs, while the orthogonal-only probe delivers the strongest OOD transfer, surpassing the best white-box baseline by up to 21% on BioASQ at under 25% of generation cost.
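The two core operations named in the abstract, projecting away the question-aligned direction and ranking neurons by Fisher importance, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes the question and the answer have each already been pooled into a single hidden-state vector (the abstract does not say how), and it uses the standard two-class Fisher ratio as a stand-in for the paper's Fisher importance. The function names are hypothetical, and the diversity-penalized layer selection is not shown.

```python
import numpy as np

def question_orthogonal(a, q, eps=1e-12):
    """Return the part of answer vector `a` that is orthogonal to question vector `q`.

    Assumption: `a` and `q` are pooled hidden states of equal dimension.
    """
    q_unit = q / (np.linalg.norm(q) + eps)
    # Subtract the projection of `a` onto the question direction.
    return a - np.dot(a, q_unit) * q_unit

def fisher_scores(X, y, eps=1e-12):
    """Per-neuron two-class Fisher ratio (an assumed stand-in for Fisher importance).

    X: (n_samples, n_neurons) features; y: binary labels (0 = factual, 1 = hallucinated).
    """
    X0, X1 = X[y == 0], X[y == 1]
    between = (X0.mean(axis=0) - X1.mean(axis=0)) ** 2
    within = X0.var(axis=0) + X1.var(axis=0) + eps
    return between / within

def top_k_neurons(X, y, k):
    """Indices of the k neurons with the highest Fisher ratio, best first."""
    return np.argsort(fisher_scores(X, y))[::-1][:k]
```

Under these assumptions, an orthogonal-only probe would be trained on `question_orthogonal(a, q)` restricted to the indices from `top_k_neurons`, while a joint probe would concatenate that vector with question features before classification.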

