CL, May 4, 2022

Masked Summarization to Generate Factually Inconsistent Summaries for Improved Factual Consistency Checking

Hwanhee Lee, Kang Min Yoo, Joonsuk Park, Hwaran Lee, Kyomin Jung

arXiv:2205.02035v131.8630 citations | h-index: 29 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of evaluating factual consistency in summarization systems, which is crucial for making NLP applications more reliable, though it is an incremental advance in generating training data.

The paper tackles the problem of generating factually inconsistent summaries to train better factual consistency classifiers for abstractive summarization, showing that classifiers trained with their method outperform existing models on seven benchmark datasets.

Despite the recent advances in abstractive summarization systems, it is still difficult to determine whether a generated summary is factually consistent with the source text. To this end, the latest approach is to train a factual consistency classifier on factually consistent and inconsistent summaries. Luckily, the former are readily available as reference summaries in existing summarization datasets. However, generating the latter remains a challenge, as they need to be factually inconsistent, yet closely relevant to the source text to be effective. In this paper, we propose to generate factually inconsistent summaries using source texts and reference summaries with key information masked. Experiments on seven benchmark datasets demonstrate that factual consistency classifiers trained on summaries generated using our method generally outperform existing models and show a competitive correlation with human judgments. We also analyze the characteristics of the summaries generated using our method. We will release the pre-trained model and the code at https://github.com/hwanheelee1993/MFMA.
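
To make the masking idea concrete, the following is a minimal sketch of how a masked source text and a masked reference summary might be prepared as input for a summarizer. It is not the authors' implementation. Everything specific here is an assumption for illustration: the `<mask>` token, the `</s>` separator, the crude rule that treats capitalized words and numbers as "key information", the masking ratios, and the helper names `is_key_token`, `mask_key_tokens`, and `build_masked_input`.

```python
import random

MASK = "<mask>"  # hypothetical mask token, not taken from the paper


def is_key_token(token: str) -> bool:
    """Crude stand-in for 'key information': capitalized words and numbers.

    Assumption for illustration only; a real system would likely rely on
    a parser or entity tagger instead.
    """
    core = token.strip(".,;:!?\"'()")
    return bool(core) and (core[0].isupper() or core[0].isdigit())


def mask_key_tokens(text: str, ratio: float, rng: random.Random) -> str:
    """Replace a fraction `ratio` of the key tokens in `text` with MASK."""
    tokens = text.split()
    key_positions = [i for i, t in enumerate(tokens) if is_key_token(t)]
    n_mask = round(len(key_positions) * ratio)
    chosen = set(rng.sample(key_positions, n_mask))
    return " ".join(MASK if i in chosen else t for i, t in enumerate(tokens))


def build_masked_input(source: str, summary: str,
                       source_ratio: float, summary_ratio: float,
                       seed: int = 0) -> str:
    """Join the masked summary and masked source into one model input.

    The separator and the ordering are assumptions; the masking ratios
    are free parameters of this sketch.
    """
    rng = random.Random(seed)
    masked_source = mask_key_tokens(source, source_ratio, rng)
    masked_summary = mask_key_tokens(summary, summary_ratio, rng)
    return masked_summary + " </s> " + masked_source


if __name__ == "__main__":
    src = "Alice moved to Paris in 2019. She works at a bakery."
    ref = "Alice lives in Paris."
    print(build_masked_input(src, ref, source_ratio=0.5, summary_ratio=1.0))
```

A summarizer that has to fill in the masked spans without seeing the hidden facts in the source can produce text that stays close to the topic but is not supported by the source, which is the kind of negative example the abstract describes.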
