CL · IR · Aug 27, 2019

Facet-Aware Evaluation for Extractive Summarization

arXiv:1908.10383v21001 citations · Has Code
AI Analysis

This work addresses the need for more accurate evaluation metrics in extractive summarization, a concern for researchers and practitioners in natural language processing. The contribution is incremental, since it builds on existing evaluation methods.

The paper tackles the problem of evaluating extractive summarization by proposing a facet-aware evaluation setup that assesses information coverage based on semantic facets rather than lexical overlap, and demonstrates that this approach correlates better with human judgment than ROUGE metrics.

Commonly adopted metrics for extractive summarization focus on lexical overlap at the token level. In this paper, we present a facet-aware evaluation setup for better assessment of the information coverage in extracted summaries. Specifically, we treat each sentence in the reference summary as a facet, identify the sentences in the document that express the semantics of each facet as support sentences of the facet, and automatically evaluate extractive summarization methods by comparing the indices of extracted sentences with the indices of the support sentences of all the facets in the reference summary. To facilitate this new evaluation setup, we construct an extractive version of the CNN/Daily Mail dataset and perform a thorough quantitative investigation, through which we demonstrate that facet-aware evaluation correlates better with human judgment than ROUGE, enables fine-grained evaluation as well as comparative analysis, and reveals valuable insights into state-of-the-art summarization methods. Data can be found at https://github.com/morningmoni/FAR.
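
The index comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `facet_coverage`, the data layout (one set of support-sentence indices per facet), and the rule that a facet counts as covered when at least one of its support sentences is extracted are all assumptions made for illustration. The scoring used in the paper and in the FAR repository may differ.

```python
from typing import Iterable, Sequence, Set


def facet_coverage(extracted: Iterable[int],
                   facet_supports: Sequence[Set[int]]) -> float:
    """Fraction of facets covered by the extracted sentence indices.

    Assumption (not from the paper): a facet is covered when at least
    one of its support sentences appears among the extracted indices.
    """
    extracted_set = set(extracted)
    if not facet_supports:
        # No facets to cover; return 0.0 by convention in this sketch.
        return 0.0
    covered = sum(1 for support in facet_supports if support & extracted_set)
    return covered / len(facet_supports)


# Hypothetical document: three facets (reference-summary sentences),
# each mapped to the indices of document sentences that support it.
facet_supports = [{0, 3}, {5}, {7, 8}]
extracted = [0, 2, 8]  # indices chosen by some extractive summarizer

print(f"{facet_coverage(extracted, facet_supports):.2f}")  # prints 0.67
```

In this toy case the first and third facets are covered (through sentences 0 and 8) and the second is not, so two of three facets are covered. Unlike a token-overlap score, the result depends only on which sentences were selected, not on their wording.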

Code Implementations · 1 repo
