CL | Aug 23, 2019

Neural Text Summarization: A Critical Evaluation

arXiv:1908.08960v1 | 1132 citations
AI Analysis

This work addresses fundamental issues in text summarization research, offering the NLP community incremental insights into the limitations of current datasets and evaluation methods.

The paper critically evaluates neural text summarization and identifies key shortcomings in datasets, evaluation metrics, and models, such as noisy data and metrics that correlate weakly with human judgment, which it links to stagnating progress on benchmarks.

Text summarization aims at compressing long documents into a shorter form that conveys the most important parts of the original document. Despite increased interest in the community and notable research effort, progress on benchmark datasets has stagnated. We critically evaluate key ingredients of the current research setup: datasets, evaluation metrics, and models, and highlight three primary shortcomings: 1) automatically collected datasets leave the task underconstrained and may contain noise detrimental to training and evaluation, 2) current evaluation protocol is weakly correlated with human judgment and does not account for important characteristics such as factual correctness, 3) models overfit to layout biases of current datasets and offer limited diversity in their outputs.

