CL · Apr 17, 2025

Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations

CMU
arXiv:2504.12691v1 · 12 citations · h-index: 27 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the challenge of diagnosing hallucinations in LLMs, which is crucial for improving model reliability in applications such as content generation and fact-checking. Its contribution is incremental: it provides a new analytical tool.

The paper tackles the problem of hallucinations in large language models by introducing a subsequence association framework to trace and understand their causes, demonstrating that the method outperforms standard attribution techniques in identifying hallucination origins.

Large language models (LLMs) frequently generate hallucinations (content that deviates from factual accuracy or provided context), posing challenges for diagnosis due to the complex interplay of underlying causes. This paper introduces a subsequence association framework to systematically trace and understand hallucinations. Our key insight is that hallucinations arise when dominant hallucinatory associations outweigh faithful ones. Through theoretical and empirical analyses, we demonstrate that decoder-only transformers effectively function as subsequence embedding models, with linear layers encoding input-output associations. We propose a tracing algorithm that identifies causal subsequences by analyzing hallucination probabilities across randomized input contexts. Experiments show our method outperforms standard attribution techniques in identifying hallucination causes and aligns with evidence from the model's training corpus. This work provides a unified perspective on hallucinations and a robust framework for their tracing and analysis.
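The tracing idea in the abstract (scoring input subsequences by how often a hallucination appears across randomized contexts) can be illustrated with a toy sketch. This is not the paper's algorithm or its released code. The function `trace_subsequences`, the `<mask>` filler token, the `keep_prob` sampling scheme, and the stand-in detector `toy_hallucinates` are all hypothetical names and choices made for illustration. In practice the detector would query an actual model.

```python
import random
from itertools import combinations
from typing import Callable, Dict, List, Sequence, Tuple


def trace_subsequences(
    tokens: Sequence[str],
    hallucinates: Callable[[List[str]], bool],
    n_samples: int = 500,
    keep_prob: float = 0.5,
    max_len: int = 2,
    filler: str = "<mask>",
    seed: int = 0,
) -> List[Tuple[Tuple[str, ...], float]]:
    """Estimate P(hallucination | subsequence present) over randomized contexts.

    Assumption: a context is randomized by independently replacing each token
    with a filler; a subsequence counts as present if all its tokens were kept.
    """
    rng = random.Random(seed)
    # Candidate subsequences: all ordered index subsets up to max_len tokens.
    candidates = [
        c for k in range(1, max_len + 1)
        for c in combinations(range(len(tokens)), k)
    ]
    present: Dict[Tuple[int, ...], int] = {}
    hits: Dict[Tuple[int, ...], int] = {}
    for _ in range(n_samples):
        kept = [rng.random() < keep_prob for _ in tokens]
        context = [t if k else filler for t, k in zip(tokens, kept)]
        bad = hallucinates(context)
        for c in candidates:
            if all(kept[i] for i in c):
                present[c] = present.get(c, 0) + 1
                hits[c] = hits.get(c, 0) + int(bad)
    scores = [
        (tuple(tokens[i] for i in c), hits[c] / present[c])
        for c in candidates
        if present.get(c, 0) > 0
    ]
    # Highest estimated hallucination rate first; shorter subsequences break ties.
    scores.sort(key=lambda s: (-s[1], len(s[0])))
    return scores


def toy_hallucinates(context: List[str]) -> bool:
    # Hypothetical stand-in for a model: it "hallucinates" only when both
    # trigger tokens survive the randomization.
    return "Einstein" in context and "Nobel" in context


prompt = ["Einstein", "won", "the", "Nobel", "prize", "for"]
ranked = trace_subsequences(prompt, toy_hallucinates)
print(ranked[:3])
```

In this toy setup the pair of trigger tokens should rank first, because it is the only candidate whose presence always coincides with the simulated hallucination. Single tokens and unrelated pairs score lower, since the hallucination also depends on tokens outside them.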

Code Implementations: 1 repo
