CL · Nov 6, 2018

DeepChannel: Salience Estimation by Contrastive Learning for Extractive Document Summarization

arXiv:1811.02394v2 · 31 citations
Originality: Incremental advance
AI Analysis

This work addresses data-efficient and robust extractive summarization. The advance is incremental, since it builds on existing attention-based and contrastive learning methods.

The authors tackle extractive document summarization with DeepChannel, a model that estimates salience scores using contrastive learning. It achieves state-of-the-art ROUGE scores on CNN/Daily Mail and strong out-of-domain performance on DUC2007, and it reaches a ROUGE-1 F-1 score of 39.41 on the CNN/Daily Mail test set when trained on only 1/100 of the training data.

We propose DeepChannel, a robust, data-efficient, and interpretable neural model for extractive document summarization. Given any document-summary pair, we estimate a salience score, modeled with an attention-based deep neural network, that represents how salient the summary is for yielding the document. We devise a contrastive training strategy to learn the salience estimation network, and then use the learned salience score as a guide to iteratively extract the most salient sentences from the document as our generated summary. In experiments, our model not only achieves state-of-the-art ROUGE scores on the CNN/Daily Mail dataset, but also shows strong robustness in the out-of-domain test on the DUC2007 test set. Moreover, our model reaches a ROUGE-1 F-1 score of 39.41 on the CNN/Daily Mail test set with merely $1/100$ of the training set, demonstrating tremendous data efficiency.
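The extraction loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: `overlap_salience` is a toy word-overlap stand-in for the attention-based salience network, `contrastive_margin_loss` is a generic margin ranking loss assumed here for illustration (the abstract does not give the loss form), and all function names, the `max_sentences` budget, and the choice to return sentences in document order are hypothetical.

```python
from typing import Callable, List, Sequence

# A salience function scores how well a candidate summary accounts for a document.
Salience = Callable[[Sequence[str], Sequence[str]], float]


def overlap_salience(document: Sequence[str], summary: Sequence[str]) -> float:
    """Toy stand-in for a learned salience network (assumption, not the paper's model):
    the fraction of distinct document words that also appear in the summary."""
    doc_words = {w.lower() for s in document for w in s.split()}
    sum_words = {w.lower() for s in summary for w in s.split()}
    if not doc_words:
        return 0.0
    return len(doc_words & sum_words) / len(doc_words)


def contrastive_margin_loss(pos_score: float, neg_score: float, margin: float = 1.0) -> float:
    """Generic margin ranking loss (assumed form): zero once the salient summary
    outscores the contrastive, less salient one by at least `margin`."""
    return max(0.0, margin - pos_score + neg_score)


def greedy_extract(document: Sequence[str], salience: Salience, max_sentences: int = 3) -> List[str]:
    """Iteratively add the sentence that maximizes the salience of the summary so far."""
    chosen: List[int] = []
    remaining = list(range(len(document)))
    while remaining and len(chosen) < max_sentences:
        best = max(
            remaining,
            key=lambda i: salience(document, [document[j] for j in chosen + [i]]),
        )
        chosen.append(best)
        remaining.remove(best)
    # Return the selected sentences in their original document order (a presentation choice).
    return [document[i] for i in sorted(chosen)]


if __name__ == "__main__":
    doc = ["the cat sat on the mat", "the cat sat", "dogs bark loudly at night"]
    print(greedy_extract(doc, overlap_salience, max_sentences=2))
```

Scoring the whole partial summary at each step, rather than each sentence in isolation, is what lets the loop skip a sentence that is redundant with one already selected.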

Code Implementations: 1 repo