CL · Oct 24, 2020

Constrained Abstractive Summarization: Preserving Factual Consistency with Constrained Generation

Yuning Mao, Xiang Ren, Heng Ji, Jiawei Han

arXiv:2010.12723v23.941 citations · Has Code

Originality: Incremental advance

AI Analysis

This paper addresses factual inconsistency in abstractive summarization, offering an incremental improvement over existing methods.

The paper tackles hallucination in abstractive summarization by proposing Constrained Abstractive Summarization (CAS), which requires specified tokens to appear in the summary to improve factual consistency. It reports gains of up to 13.8 ROUGE-2 in the interactive setting when only one manual constraint is used.

Despite significant progress, state-of-the-art abstractive summarization methods are still prone to hallucinate content inconsistent with the source document. In this paper, we propose Constrained Abstractive Summarization (CAS), a general setup that preserves the factual consistency of abstractive summarization by specifying tokens as constraints that must be present in the summary. We adopt lexically constrained decoding, a technique generally applicable to autoregressive generative models, to fulfill CAS and conduct experiments in two scenarios: (1) automatic summarization without human involvement, where keyphrases are extracted from the source document and used as constraints; (2) human-guided interactive summarization, where human feedback in the form of manual constraints is used to guide summary generation. Automatic and human evaluations on two benchmark datasets demonstrate that CAS improves both lexical overlap (ROUGE) and factual consistency of abstractive summarization. In particular, we observe up to 13.8 ROUGE-2 gains when only one manual constraint is used in interactive summarization.
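To make the idea of lexically constrained decoding concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the `TOY_LM` table, the function name `constrained_beam_search`, and the ranking rule (hypotheses that cover more constraints are kept first, then ordered by log-probability) are all assumptions chosen for illustration, standing in for a real autoregressive summarizer and a real constrained-decoding algorithm.

```python
import math

# Hypothetical toy next-token distribution (an assumption for illustration,
# not a model from the paper). Maps previous token -> {next token: probability}.
TOY_LM = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"company": 0.7, "ceo": 0.3},
    "a": {"company": 0.5, "ceo": 0.5},
    "company": {"grew": 0.8, "</s>": 0.2},
    "ceo": {"resigned": 0.9, "</s>": 0.1},
    "grew": {"</s>": 1.0},
    "resigned": {"</s>": 1.0},
}


def constrained_beam_search(lm, constraints, beam_size=3, max_len=6):
    """Return the best finished token list that contains every constraint
    token, or None if no such hypothesis is found within the beam."""
    constraints = set(constraints)
    beams = [(["<s>"], 0.0)]  # each beam is (tokens, log-probability)
    finished = []
    for _ in range(max_len):
        # Expand every live hypothesis by one token.
        candidates = []
        for tokens, score in beams:
            for tok, p in lm.get(tokens[-1], {}).items():
                candidates.append((tokens + [tok], score + math.log(p)))
        if not candidates:
            break
        # Simplified ranking: constraints covered first, then log-probability.
        candidates.sort(
            key=lambda c: (len(constraints & set(c[0])), c[1]), reverse=True
        )
        beams = []
        for tokens, score in candidates[:beam_size]:
            if tokens[-1] == "</s>":
                # Accept a finished hypothesis only if all constraints appear.
                if constraints <= set(tokens):
                    finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    if not finished:
        return None
    best_tokens, _ = max(finished, key=lambda c: c[1])
    return best_tokens[1:-1]  # strip <s> and </s>


if __name__ == "__main__":
    print(constrained_beam_search(TOY_LM, []))
    print(constrained_beam_search(TOY_LM, ["resigned"]))
```

With no constraints the search behaves like ordinary beam search. Passing a constraint token such as `"resigned"` (playing the role of an extracted keyphrase or a manual constraint) steers the output toward a hypothesis that contains it, even though that hypothesis is less probable under the toy model.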
