CL · Oct 29, 2024

Distinguishing Ignorance from Error in LLM Hallucinations

arXiv:2410.22071v2 · 16 citations · h-index: 55 · Has Code
Originality: Incremental advance
AI Analysis

This work targets LLM hallucinations, which matter to any user who relies on accurate outputs. The advance is incremental: it builds on existing detection and mitigation work.

The paper distinguishes two types of LLM hallucination: HK- (the model lacks the required knowledge) and HK+ (the model answers incorrectly despite having that knowledge). It finds that HK+ hallucinations are prevalent across models and datasets, that separating the two types helps mitigation, and that different models hallucinate on different examples.

Large language models (LLMs) are susceptible to hallucinations -- factually incorrect outputs -- leading to a large body of work on detecting and mitigating such cases. We argue that it is important to distinguish between two types of hallucinations: ones where the model does not hold the correct answer in its parameters, which we term HK-, and ones where the model answers incorrectly despite having the required knowledge, termed HK+. We first find that HK+ hallucinations are prevalent and occur across models and datasets. Then, we demonstrate that distinguishing between these two cases is beneficial for mitigating hallucinations. Importantly, we show that different models hallucinate on different examples, which motivates constructing model-specific hallucination datasets for training detectors. Overall, our findings draw attention to classifying types of hallucinations and provide means to handle them more effectively. The code is available at https://github.com/technion-cs-nlp/hallucination-mitigation .
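The HK-/HK+ split described above can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that); it assumes a hypothetical "knowledge probe" in which the model is sampled several times in a plain setting, and it treats the model as knowing an answer when a chosen fraction of those samples match the gold answer. The names `Example`, `knows_answer`, `classify`, `jaccard`, and the `min_fraction` threshold are all illustrative assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Label(Enum):
    CORRECT = "correct"
    HK_MINUS = "HK-"  # wrong, and the model does not appear to hold the answer
    HK_PLUS = "HK+"   # wrong, although the model appears to hold the answer


@dataclass
class Example:
    gold: str
    probe_answers: list  # answers sampled in a plain setting (hypothetical probe)
    final_answer: str    # answer produced in the setting under evaluation


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(text.lower().split())


def knows_answer(example: Example, min_fraction: float = 1.0) -> bool:
    """Assumed criterion: enough probe samples match the gold answer."""
    if not example.probe_answers:
        return False
    gold = normalize(example.gold)
    hits = sum(normalize(a) == gold for a in example.probe_answers)
    return hits / len(example.probe_answers) >= min_fraction


def classify(example: Example, min_fraction: float = 1.0) -> Label:
    """Label one example as correct, HK-, or HK+."""
    if normalize(example.final_answer) == normalize(example.gold):
        return Label.CORRECT
    if knows_answer(example, min_fraction):
        return Label.HK_PLUS
    return Label.HK_MINUS


def jaccard(ids_a: set, ids_b: set) -> float:
    """Overlap between the hallucinated example IDs of two models."""
    union = ids_a | ids_b
    if not union:
        return 1.0  # both sets empty: treated as identical
    return len(ids_a & ids_b) / len(union)
```

A low `jaccard` score between two models' HK+ example IDs would be consistent with the claim that different models hallucinate on different examples, which is the motivation the abstract gives for model-specific detector training data.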

Code Implementations: 1 repo