CL, Aug 30, 2021

Hallucinated but Factual! Inspecting the Factuality of Hallucinations in Abstractive Summarization

Meng Cao, Yue Dong, Jackie Chi Kit Cheung

arXiv:2109.09784v2

Originality: Incremental advance

AI Analysis

This paper addresses the factuality of hallucinated content in summaries produced by NLP systems, offering a novel detection approach with incremental improvements.

The paper tackles the problem of hallucinations in abstractive summarization, finding that much hallucinated content is factual and can be beneficial. It proposes a detection method for entity hallucinations that outperforms two baselines and improves summary factuality when used as a reward signal in off-line reinforcement learning.

State-of-the-art abstractive summarization systems often generate hallucinations; i.e., content that is not directly inferable from the source text. Despite being assumed incorrect, we find that much hallucinated content is factual, namely consistent with world knowledge. These factual hallucinations can be beneficial in a summary by providing useful background information. In this work, we propose a novel detection approach that separates factual from non-factual hallucinations of entities. Our method utilizes an entity's prior and posterior probabilities according to pre-trained and finetuned masked language models, respectively. Empirical results suggest that our approach vastly outperforms two baselines and strongly correlates with human judgments. Furthermore, we show that our detector, when used as a reward signal in an off-line reinforcement learning (RL) algorithm, significantly improves the factuality of summaries while maintaining the level of abstractiveness.
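To make the detection idea in the abstract concrete, here is a minimal sketch of classifying an entity from its prior and posterior probabilities. It is not the authors' implementation. The probabilities are supplied as plain numbers rather than computed by masked language models, the labeled examples in `TRAIN` are invented, and the nearest-neighbor vote in the hypothetical `classify_entity` function is an illustrative assumption about how the two features might be combined.

```python
from collections import Counter
from math import dist

# Hypothetical labeled entities: (prior, posterior) -> label.
# All values are invented for illustration; in practice the prior would come
# from a pre-trained masked LM and the posterior from a fine-tuned one.
TRAIN = [
    ((0.60, 0.90), "factual"),
    ((0.45, 0.80), "factual"),
    ((0.30, 0.70), "factual"),
    ((0.02, 0.05), "non-factual"),
    ((0.01, 0.10), "non-factual"),
    ((0.05, 0.02), "non-factual"),
]


def classify_entity(prior, posterior, train=TRAIN, k=3):
    """Label an entity by majority vote among its k nearest labeled
    neighbors in (prior, posterior) space (assumed classifier)."""
    query = (prior, posterior)
    nearest = sorted(train, key=lambda example: dist(example[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(classify_entity(0.50, 0.85))
    print(classify_entity(0.03, 0.04))
```

The same function could in principle serve as a per-entity reward signal, for example by scoring a summary by the fraction of its entities labeled factual, though that aggregation is also an assumption and not something the abstract specifies.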

