CL · Oct 6, 2020

SupMMD: A Sentence Importance Model for Extractive Summarization using Maximum Mean Discrepancy

arXiv:2010.02568v1993 citations
Originality: Incremental advance
AI Analysis

The paper addresses a practical need, summarizing updates on evolving topics such as news, but the advance is incremental because it builds on existing methods for multi-document summarization.

The paper proposes SupMMD, a technique for both generic and update summarization; update summarization identifies the new information in an evolving document set. SupMMD is based on the maximum mean discrepancy and combines supervised learning (for salience) with unsupervised learning (for coverage and diversity). The authors report results that meet or exceed the state of the art on the DUC-2004 and TAC-2009 datasets.

Most work on multi-document summarization has focused on generic summarization of information present in each individual document set. However, the under-explored setting of update summarization, where the goal is to identify the new information present in each set, is of equal practical interest (e.g., presenting readers with updates on an evolving news topic). In this work, we present SupMMD, a novel technique for generic and update summarization based on the maximum mean discrepancy from kernel two-sample testing. SupMMD combines both supervised learning for salience and unsupervised learning for coverage and diversity. Further, we adapt multiple kernel learning to make use of similarity across multiple information sources (e.g., text features and knowledge based concepts). We show the efficacy of SupMMD in both generic and update summarization tasks by meeting or exceeding the current state-of-the-art on the DUC-2004 and TAC-2009 datasets.
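To make the maximum mean discrepancy idea concrete, here is a minimal sketch of the unsupervised part only: it estimates squared MMD between two sets of sentence vectors and greedily picks the sentences whose empirical distribution is closest to the full set. This is not the authors' implementation. The RBF kernel, the `gamma` value, the biased estimator, the greedy search, and the function names (`rbf_kernel`, `mmd2`, `greedy_select`) are all assumptions made for illustration, and the supervised importance weighting and multiple kernel learning described in the abstract are left out.

```python
import numpy as np


def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    sq_dists = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def mmd2(X, Y, gamma=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples X and Y."""
    return (rbf_kernel(X, X, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean()
            - 2.0 * rbf_kernel(X, Y, gamma).mean())


def greedy_select(S, k, gamma=1.0):
    """Greedily pick k row indices of S that minimize MMD to the full set S.

    Illustrative assumption: S holds one feature vector per sentence.
    """
    chosen = []
    for _ in range(k):
        best, best_val = None, np.inf
        for i in range(len(S)):
            if i in chosen:
                continue
            val = mmd2(S[chosen + [i]], S, gamma)
            if val < best_val:
                best, best_val = i, val
        chosen.append(best)
    return chosen


# Toy data: two tight clusters of five "sentences" each (hypothetical vectors).
rng = np.random.default_rng(0)
cluster_a = rng.normal(0.0, 0.01, size=(5, 3))
cluster_b = rng.normal(10.0, 0.01, size=(5, 3))
sentences = np.vstack([cluster_a, cluster_b])

summary_idx = greedy_select(sentences, k=2)
```

On this toy input the two selected indices come from different clusters, because adding a second sentence from the same cluster leaves the MMD to the full set almost unchanged while adding one from the other cluster drives it toward zero. That is the sense in which an MMD objective rewards coverage and diversity.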

Code Implementations: 1 repo
