Entity-level Factual Consistency of Abstractive Text Summarization
This work addresses factual consistency, a practical concern for users of abstractive summarization models, but its contribution is incremental because it builds on existing methods.
The paper tackles the problem of entity hallucination in abstractive text summarization by proposing new metrics to quantify entity-level factual consistency and showing that filtering the training data alleviates the issue, with further improvements from a summary-worthy entity classification task and a joint entity and summary generation approach.
A key challenge for abstractive summarization is ensuring factual consistency of the generated summary with respect to the original document. For example, state-of-the-art models trained on existing datasets exhibit entity hallucination, generating names of entities that are not present in the source document. We propose a set of new metrics to quantify the entity-level factual consistency of generated summaries, and we show that the entity hallucination problem can be alleviated by simply filtering the training data. In addition, we propose adding a summary-worthy entity classification task to the training process, as well as a joint entity and summary generation approach, both of which yield further improvements in entity-level metrics.
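
To make the idea of an entity-level check and data filter concrete, here is a minimal sketch. It is not the paper's definition: the abstract does not spell out the metrics, so the precision-style ratio, the case-insensitive substring matching, the threshold, and the function names `entity_precision_source` and `filter_training_pairs` are all illustrative assumptions. Entity lists are assumed to come from some named-entity recognizer that is not shown here.

```python
def entity_precision_source(summary_entities, source_text):
    """Fraction of summary entities whose surface form appears in the source.

    Assumption: a simple case-insensitive substring match stands in for
    whatever entity matching the paper actually uses.
    """
    entities = [e.strip() for e in summary_entities if e.strip()]
    if not entities:
        # Assumed convention: a summary with no entities hallucinates none.
        return 1.0
    source_lower = source_text.lower()
    found = sum(1 for e in entities if e.lower() in source_lower)
    return found / len(entities)


def filter_training_pairs(pairs, threshold=1.0):
    """Keep (source, summary, summary_entities) triples scoring >= threshold.

    Assumption: the threshold value is hypothetical; 1.0 keeps only
    summaries whose entities are all found in the source.
    """
    return [p for p in pairs if entity_precision_source(p[2], p[0]) >= threshold]


# Toy data, invented for illustration only.
source = "Alice Smith met Bob Jones in Paris on Monday."
pairs = [
    (source, "Alice Smith met Bob Jones.", ["Alice Smith", "Bob Jones"]),
    (source, "Alice Smith met Carol White.", ["Alice Smith", "Carol White"]),
]
kept = filter_training_pairs(pairs)
```

In this toy example the second summary mentions an entity absent from the source, so it scores below the threshold and is dropped, which mirrors the filtering idea described in the abstract.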