Sentence Centrality Revisited for Unsupervised Summarization
This work addresses the challenge of producing high-quality summaries without large-scale training data. The contribution is incremental: it builds on existing graph-based methods and adds neural enhancements.
The authors tackle single-document summarization with an unsupervised approach that modifies a graph-based ranking algorithm in two ways: it uses BERT for sentence representation, and it builds directed edges based on sentence position in the document. The approach outperforms strong baselines by a wide margin on three news datasets.
Single document summarization has enjoyed renewed interest in recent years thanks to the popularity of neural network models and the availability of large-scale datasets. In this paper we develop an unsupervised approach, arguing that it is unrealistic to expect large-scale and high-quality training data to be available or created for different types of summaries, domains, or languages. We revisit a popular graph-based ranking algorithm and modify how node (aka sentence) centrality is computed in two ways: (a) we employ BERT, a state-of-the-art neural representation learning model, to better capture sentential meaning, and (b) we build graphs with directed edges, arguing that the contribution of any two nodes to their respective centrality is influenced by their relative position in a document. Experimental results on three news summarization datasets representative of different languages and writing styles show that our approach outperforms strong baselines by a wide margin.
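
The idea of position-aware, directed centrality can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the scoring formula, so the weighted degree centrality below, the function names (`cosine`, `directed_centrality`, `rank_sentences`), and the parameters `w_forward` and `w_backward` are all assumptions made for illustration. Plain lists of floats stand in for BERT sentence embeddings, and cosine similarity is assumed as the edge weight.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def directed_centrality(vectors, w_forward=1.0, w_backward=0.0):
    """Score each sentence by its similarity to the other sentences.

    Hypothetical weighting: similarity to sentences that come LATER in the
    document is scaled by w_forward, similarity to EARLIER sentences by
    w_backward. Setting both weights to 1.0 gives ordinary undirected
    degree centrality.
    """
    n = len(vectors)
    scores = []
    for i in range(n):
        score = 0.0
        for j in range(n):
            if i == j:
                continue
            weight = w_forward if j > i else w_backward
            score += weight * cosine(vectors[i], vectors[j])
        scores.append(score)
    return scores


def rank_sentences(vectors, k=1, w_forward=1.0, w_backward=0.0):
    """Return the indices of the k highest-scoring sentences, in document order."""
    scores = directed_centrality(vectors, w_forward, w_backward)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:k])


# Toy stand-ins for sentence embeddings: sentences 0 and 1 are identical in
# meaning, sentence 2 is unrelated to both.
toy_vectors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
undirected = directed_centrality(toy_vectors, w_forward=1.0, w_backward=1.0)
directed = directed_centrality(toy_vectors, w_forward=1.0, w_backward=0.0)
```

With equal weights, sentences 0 and 1 receive the same score because the graph is symmetric. Once only forward edges count, the earlier of the two similar sentences scores higher, which shows how relative position can break ties that an undirected graph cannot.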