Are Hallucinations Bad Estimations?
The paper reframes hallucinations as a structural misalignment between loss minimization and human-acceptable outputs, a foundational problem for AI safety and reliability.
The paper formalizes hallucinations in generative models as failures to link estimates to plausible causes, showing that even optimal loss-minimizing estimators still hallucinate, with a general high-probability lower bound on hallucination rates for generic data distributions.
We formalize hallucinations in generative models as failures to link an estimate to any plausible cause. Under this interpretation, we show that even loss-minimizing optimal estimators still hallucinate. We confirm this with a general high-probability lower bound on the hallucination rate for generic data distributions. This reframes hallucination as a structural misalignment between loss minimization and human-acceptable outputs, and hence as estimation errors induced by miscalibration. Experiments on coin aggregation, open-ended QA, and text-to-image generation support our theory.
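The claim that a loss-minimizing estimator can still hallucinate can be illustrated with a small toy example. The sketch below is not the paper's construction or its coin-aggregation experiment. It assumes a bimodal target with modes at -1 and +1, squared-error loss, and a hypothetical plausibility test (`is_plausible`, with tolerance `tol`) that accepts an estimate only if it lies near some mode. Under these assumptions the minimizer of the empirical loss is the sample mean, which falls between the modes and so matches no plausible cause.

```python
import random
import statistics

def sample_outcomes(n, seed=0):
    """Draw n outcomes from an assumed bimodal target: modes at -1 and +1, small Gaussian noise."""
    rng = random.Random(seed)
    return [rng.choice([-1.0, 1.0]) + rng.gauss(0.0, 0.1) for _ in range(n)]

def mse(estimate, outcomes):
    """Empirical squared-error loss of a single constant estimate."""
    return sum((y - estimate) ** 2 for y in outcomes) / len(outcomes)

def is_plausible(estimate, modes=(-1.0, 1.0), tol=0.3):
    """Hypothetical plausibility test: the estimate must lie within tol of some mode."""
    return any(abs(estimate - m) <= tol for m in modes)

outcomes = sample_outcomes(10_000)

# The sample mean minimizes the empirical squared-error loss over constant estimates.
best = statistics.fmean(outcomes)

print("loss-minimizing estimate:", round(best, 3))
print("loss at minimizer:", round(mse(best, outcomes), 3))
print("loss at mode +1:", round(mse(1.0, outcomes), 3))
print("minimizer plausible?", is_plausible(best))
print("mode +1 plausible?", is_plausible(1.0))
```

In this toy setting the estimate with the lowest loss is the one that fails the plausibility test, while an estimate at either mode passes the test at a higher loss. The tolerance and the definition of plausibility are illustrative choices, not the paper's formal definitions.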