CL May 2, 2023

DiffuSum: Generation Enhanced Extractive Summarization with Diffusion

arXiv:2305.01735v2236 citationsHas Code
AI Analysis

This work addresses extractive summarization for NLP researchers and practitioners by introducing a generative approach. The approach shows potential for adaptation, but the contribution is incremental because it is applied to one specific task.

The paper tackles extractive summarization by proposing DiffuSum, a novel paradigm that uses diffusion models to generate summary sentence representations and matches them to source sentences, achieving state-of-the-art results on CNN/DailyMail with ROUGE scores of 44.83/22.56/40.56.

Extractive summarization aims to form a summary by directly extracting sentences from the source document. Existing works mostly formulate it as a sequence labeling problem by making individual sentence label predictions. This paper proposes DiffuSum, a novel paradigm for extractive summarization that directly generates the desired summary sentence representations with diffusion models and extracts sentences based on sentence representation matching. In addition, DiffuSum jointly optimizes a contrastive sentence encoder with a matching loss for sentence representation alignment and a multi-class contrastive loss for representation diversity. Experimental results show that DiffuSum achieves new state-of-the-art extractive results on CNN/DailyMail with ROUGE scores of 44.83/22.56/40.56. Experiments on two other datasets with different summary lengths also demonstrate the effectiveness of DiffuSum. The strong performance of our framework shows the great potential of adapting generative models for extractive summarization. To encourage follow-up work, we have released our code at https://github.com/hpzhang94/DiffuSum.
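The extraction step described in the abstract, matching generated summary sentence representations against source sentence representations, can be illustrated with a small sketch. This is not the authors' implementation: the cosine similarity measure, the greedy no-repeat selection, the function names `cosine` and `match_sentences`, and the toy 3-dimensional vectors are all assumptions made for illustration. In DiffuSum the representations would come from the diffusion model and the contrastive sentence encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def match_sentences(generated, source):
    """Map each generated representation to its most similar unused source sentence.

    Hypothetical helper: greedy matching without repeats is an assumption,
    not necessarily the matching rule used in the paper.
    Returns the selected source indices in document order.
    """
    chosen = []
    for g in generated:
        candidates = [i for i in range(len(source)) if i not in chosen]
        if not candidates:
            break
        best = max(candidates, key=lambda i: cosine(g, source[i]))
        chosen.append(best)
    return sorted(chosen)


# Toy data: four source sentence vectors and two "generated" summary vectors.
source = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.7, 0.7, 0.0]]
generated = [[0.9, 0.1, 0.0], [0.1, 0.0, 0.95]]

selected = match_sentences(generated, source)
print(selected)
```

With these toy vectors, the first generated representation lies closest to source sentence 0 and the second to source sentence 2, so those two sentences would form the extracted summary.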
