Confabulation: The Surprising Value of Large Language Model Hallucinations
The paper challenges the standard view in AI research that hallucinations are flaws, suggesting instead that they may enhance narrative generation. The contribution is incremental, however, because it reinterprets existing phenomena without introducing new methods.
The paper argues that large language model hallucinations, or confabulations, are not inherently problematic but can be a resource, as they exhibit increased narrativity and semantic coherence compared to veridical outputs, based on analysis of hallucination benchmarks.
This paper presents a systematic defense of large language model (LLM) hallucinations, or 'confabulations', as a potential resource rather than a categorically negative pitfall. The standard view is that confabulations are inherently problematic and that AI research should eliminate this flaw. In this paper, we argue and empirically demonstrate that measurable semantic characteristics of LLM confabulations mirror a human propensity to utilize increased narrativity as a cognitive resource for sense-making and communication. In other words, confabulation has potential value. Specifically, we analyze popular hallucination benchmarks and reveal that hallucinated outputs display increased levels of narrativity and semantic coherence relative to veridical outputs. This finding reveals a tension in our usually dismissive understandings of confabulation. It suggests, counter-intuitively, that the tendency for LLMs to confabulate may be intimately associated with a positive capacity for coherent narrative-text generation.
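The comparison of hallucinated and veridical outputs described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the text here does not specify how narrativity or semantic coherence is measured, so the sketch substitutes a crude proxy (mean cosine similarity between bag-of-words vectors of adjacent sentences), and every function name is hypothetical.

```python
import math
import re
from collections import Counter
from statistics import mean


def split_sentences(text):
    """Naive sentence splitter on ., ! and ? (an assumption, not a real tokenizer)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def bag_of_words(sentence):
    """Lowercased word counts for one sentence."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def coherence(text):
    """Stand-in coherence proxy: mean similarity of adjacent sentences.

    Returns 0.0 for texts with fewer than two sentences.
    """
    vectors = [bag_of_words(s) for s in split_sentences(text)]
    if len(vectors) < 2:
        return 0.0
    return mean(cosine(x, y) for x, y in zip(vectors, vectors[1:]))


def mean_coherence_gap(hallucinated, veridical):
    """Mean proxy score of hallucinated outputs minus that of veridical outputs."""
    return mean(map(coherence, hallucinated)) - mean(map(coherence, veridical))
```

Under this proxy, a positive value from `mean_coherence_gap` would point in the direction the abstract reports. A lexical-overlap score is a weak substitute for a real coherence or narrativity measure, so it should be read as a placeholder for whichever metric the analysis actually uses.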