CL Nov 18, 2021

SummaC: Re-Visiting NLI-based Models for Inconsistency Detection in Summarization

Philippe Laban, Tobias Schnabel, Paul N. Bennett, Marti A. Hearst

arXiv:2111.09525v133.6692 citations Has Code

Originality: Incremental advance

AI Analysis

The paper addresses the need for accurate inconsistency detection in summarization, which is central to the factual reliability of automated summaries. Its contribution is a strong gain within this specific domain.

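The paper tackles the detection of factual inconsistencies in summaries by revisiting natural language inference (NLI) models, which previously underperformed because of a granularity mismatch: NLI datasets are sentence-level, while inconsistency detection operates at the document level. It introduces SummaCConv, which segments documents into sentences and aggregates NLI scores over sentence pairs, and reports a state-of-the-art balanced accuracy of 74.4% on the new SummaC benchmark, a 5 percentage point improvement over prior work.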
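Balanced accuracy, the metric quoted above, is the mean of per-class recall, so a detector that labels every summary as consistent scores only 50% regardless of class imbalance. The snippet below is a minimal sketch of that definition; the label convention (1 = consistent, 0 = inconsistent) and the function name are assumptions made for illustration, not taken from the paper's code.

```python
from typing import Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of the recall on each class (assumed labels: 1 = consistent, 0 = inconsistent)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    recalls = []
    for label in (0, 1):
        # Indices of the examples whose gold label is this class.
        idx = [i for i, y in enumerate(y_true) if y == label]
        if not idx:
            raise ValueError(f"no examples with gold label {label}")
        correct = sum(1 for i in idx if y_pred[i] == label)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


# Imbalanced toy data: 8 consistent summaries, 2 inconsistent ones.
gold = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
always_consistent = [1] * 10
# Plain accuracy would be 0.8 here; balanced accuracy is (0.0 + 1.0) / 2.
print(balanced_accuracy(gold, always_consistent))
```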

In the summarization domain, a key requirement for summaries is to be factually consistent with the input document. Previous work has found that natural language inference (NLI) models do not perform competitively when applied to inconsistency detection. In this work, we revisit the use of NLI for inconsistency detection, finding that past work suffered from a mismatch in input granularity between NLI datasets (sentence-level) and inconsistency detection (document-level). We provide a highly effective and light-weight method called SummaCConv that enables NLI models to be successfully used for this task by segmenting documents into sentence units and aggregating scores between pairs of sentences. On our newly introduced benchmark called SummaC (Summary Consistency) consisting of six large inconsistency detection datasets, SummaCConv obtains state-of-the-art results with a balanced accuracy of 74.4%, a 5 percentage point improvement compared to prior work. We make the models and datasets available: https://github.com/tingofurro/summac
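The segment-then-aggregate idea in the abstract can be illustrated with a short sketch. It is not the authors' SummaCConv implementation: the abstract does not specify the aggregation, so the code uses a simple max-then-mean rule as an assumed stand-in, and `toy_entailment_score` is a hypothetical token-overlap placeholder where a real NLI model would go. All function names here are illustrative.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def toy_entailment_score(premise: str, hypothesis: str) -> float:
    """Hypothetical stand-in for an NLI model: share of hypothesis tokens found in the premise."""
    premise_tokens = set(re.findall(r"\w+", premise.lower()))
    hypothesis_tokens = re.findall(r"\w+", hypothesis.lower())
    if not hypothesis_tokens:
        return 0.0
    hits = sum(1 for tok in hypothesis_tokens if tok in premise_tokens)
    return hits / len(hypothesis_tokens)


def consistency_score(
    document: str,
    summary: str,
    score_fn: Callable[[str, str], float] = toy_entailment_score,
) -> float:
    """Score every (document sentence, summary sentence) pair, then aggregate.

    Assumed aggregation: for each summary sentence keep its best-supporting
    document sentence (max), then average over summary sentences (mean).
    """
    doc_sents = split_sentences(document)
    sum_sents = split_sentences(summary)
    if not doc_sents or not sum_sents:
        return 0.0
    # Pair matrix: rows are document sentences, columns are summary sentences.
    pair_scores = [[score_fn(d, s) for s in sum_sents] for d in doc_sents]
    best_support = [
        max(row[j] for row in pair_scores) for j in range(len(sum_sents))
    ]
    return sum(best_support) / len(best_support)


document = "The cat sat on the mat. It rained all day."
print(consistency_score(document, "The cat sat on the mat."))
print(consistency_score(document, "The dog barked."))
```

Scoring at the sentence level keeps each NLI input close to the granularity the model was trained on, which is the mismatch the abstract identifies. Replacing `toy_entailment_score` with a real NLI model's entailment probability would leave the aggregation code unchanged.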
