Improving Factual Consistency in Summarization with Compression-Based Post-Editing
The work addresses factual errors in generated summaries, which matters to users who rely on accurate information, though the contribution is incremental because it builds on existing post-editing approaches.
The paper tackles factual inconsistency in summarization models by proposing a compression-based post-editing method that removes extrinsic entity errors, improving entity precision on XSum by up to 30%, and by up to a total of 38% when applied on top of another post-editor.
State-of-the-art summarization models still struggle to be factually consistent with the input text. A model-agnostic way to address this problem is post-editing the generated summaries. However, existing approaches typically fail to remove entity errors if a suitable input entity replacement is not available or may insert erroneous content. In our work, we focus on removing extrinsic entity errors, or entities not in the source, to improve consistency while retaining the summary's essential information and form. We propose to use sentence-compression data to train the post-editing model to take a summary with extrinsic entity errors marked with special tokens and output a compressed, well-formed summary with those errors removed. We show that this model improves factual consistency while maintaining ROUGE, improving entity precision by up to 30% on XSum, and that this model can be applied on top of another post-editor, improving entity precision by up to a total of 38%. We perform an extensive comparison of post-editing approaches that demonstrates trade-offs between factual consistency, informativeness, and grammaticality, and we analyze settings where post-editors show the largest improvements.
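To make the two central notions concrete, the following is a minimal sketch of how entity precision and the marking of extrinsic entities might be computed. It is not the authors' implementation. It assumes that summary entities have already been extracted (for example by an NER tagger), that an entity counts as supported when it appears as a case-insensitive substring of the source, that an empty entity list scores 1.0, and that the marker tokens are `<ent>` and `</ent>`. The function names, the marker tokens, and the example texts are all hypothetical.

```python
from typing import List


def is_extrinsic(entity: str, source: str) -> bool:
    """Assumption: an entity is extrinsic if it is not a
    case-insensitive substring of the source document."""
    return entity.lower() not in source.lower()


def entity_precision(summary_entities: List[str], source: str) -> float:
    """Fraction of summary entities that are found in the source.
    Assumption: a summary with no entities scores 1.0."""
    if not summary_entities:
        return 1.0
    supported = sum(1 for e in summary_entities if not is_extrinsic(e, source))
    return supported / len(summary_entities)


def mark_extrinsic(
    summary: str,
    summary_entities: List[str],
    source: str,
    open_tok: str = "<ent>",   # hypothetical marker token
    close_tok: str = "</ent>",  # hypothetical marker token
) -> str:
    """Wrap each extrinsic entity in marker tokens, producing the kind of
    input a compression-based post-editor could be trained to consume."""
    marked = summary
    for entity in summary_entities:
        if is_extrinsic(entity, source):
            marked = marked.replace(entity, f"{open_tok} {entity} {close_tok}")
    return marked


# Illustrative (invented) example texts.
source = "The council approved the new budget on Tuesday after a long debate."
summary = "Mayor John Smith said the council approved the budget on Tuesday."
entities = ["John Smith", "Tuesday"]  # assumed output of an NER step

precision = entity_precision(entities, source)
marked_summary = mark_extrinsic(summary, entities, source)
print(precision)
print(marked_summary)
```

In this sketch, only "Tuesday" is supported by the source, so one of two entities counts toward precision, and "John Smith" is wrapped in marker tokens. A trained post-editor would then be expected to rewrite the marked summary into a shorter, well-formed sentence without the marked span; that model is outside the scope of this example.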