CL · Nov 11, 2022

Improving Factual Consistency in Summarization with Compression-Based Post-Editing

Alexander R. Fabbri, Prafulla Kumar Choubey, Jesse Vig, Chien-Sheng Wu, Caiming Xiong

Salesforce

arXiv:2211.06196v124.3296 citations · h-index: 38 · Has Code

Originality: Incremental advance

AI Analysis

The work addresses factual errors in generated summaries, which matters to users who rely on accurate information. It is incremental in that it builds on existing post-editing approaches.

The paper tackles the problem of factual inconsistency in summarization models by proposing a compression-based post-editing method to remove extrinsic entity errors, resulting in up to 30% improvement in entity precision on XSum and up to 38% when combined with another post-editor.

State-of-the-art summarization models still struggle to be factually consistent with the input text. A model-agnostic way to address this problem is post-editing the generated summaries. However, existing approaches typically fail to remove entity errors if a suitable input entity replacement is not available or may insert erroneous content. In our work, we focus on removing extrinsic entity errors, or entities not in the source, to improve consistency while retaining the summary's essential information and form. We propose to use sentence-compression data to train the post-editing model to take a summary with extrinsic entity errors marked with special tokens and output a compressed, well-formed summary with those errors removed. We show that this model improves factual consistency while maintaining ROUGE, improving entity precision by up to 30% on XSum, and that this model can be applied on top of another post-editor, improving entity precision by up to a total of 38%. We perform an extensive comparison of post-editing approaches that demonstrates trade-offs between factual consistency, informativeness, and grammaticality, and we analyze settings where post-editors show the largest improvements.
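To make the terms in the abstract concrete, the following is a minimal sketch of how extrinsic entities might be marked and how entity precision might be computed. It is not the authors' implementation: the function names, the `<ent>`/`</ent>` marker tokens, and the case-insensitive substring match are all assumptions made for illustration, and the entity list is supplied by hand rather than by a named-entity recognizer. The trained post-editing model that rewrites the marked summary is not shown.

```python
def is_extrinsic(entity: str, source: str) -> bool:
    """Treat an entity as extrinsic if it does not appear in the source.

    Assumption: a case-insensitive substring match stands in for
    whatever entity matching a real system would use.
    """
    return entity.lower() not in source.lower()


def entity_precision(summary_entities: list[str], source: str) -> float:
    """Fraction of summary entities that are supported by the source."""
    if not summary_entities:
        # Assumption: a summary with no entities has no entity errors.
        return 1.0
    supported = sum(not is_extrinsic(e, source) for e in summary_entities)
    return supported / len(summary_entities)


def mark_extrinsic(
    summary: str,
    summary_entities: list[str],
    source: str,
    open_tok: str = "<ent>",    # hypothetical marker token
    close_tok: str = "</ent>",  # hypothetical marker token
) -> str:
    """Wrap each extrinsic entity in marker tokens.

    The marked string is the kind of input a post-editing model
    would receive; the model itself is out of scope here.
    """
    marked = summary
    for entity in summary_entities:
        if is_extrinsic(entity, source):
            marked = marked.replace(entity, f"{open_tok} {entity} {close_tok}")
    return marked


# Toy data, invented for illustration only.
source = "The council approved the new budget on Tuesday after a long debate."
summary = "Mayor John Smith said the council approved the new budget on Tuesday."
entities = ["John Smith", "Tuesday"]

precision = entity_precision(entities, source)
marked = mark_extrinsic(summary, entities, source)
print(precision)
print(marked)
```

In this toy case "John Smith" is absent from the source, so it is the only entity that gets marked, and one of the two summary entities is supported. A compression-based post-editor would then be expected to drop the marked span while keeping the sentence well formed.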
