CL · AI · Nov 16, 2023

PELMS: Pre-training for Effective Low-Shot Multi-Document Summarization

arXiv:2311.09836v1 · 30 citations · h-index: 5
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of generating concise, fluent, and faithful summaries from multiple documents, a task less studied than single-document summarization. Its contribution is an incremental improvement in pre-training techniques.

The paper tackles abstractive multi-document summarization (MDS) by introducing PELMS, a model pre-trained with objectives based on semantic coherence and faithfulness. PELMS consistently outperforms competitive methods in low-shot settings across a range of datasets.

We investigate pre-training techniques for abstractive multi-document summarization (MDS), which is much less studied than summarizing single documents. Though recent work has demonstrated the effectiveness of highlighting information salience for pre-training strategy design, it struggles to generate abstractive and reflective summaries, which are critical properties for MDS. To this end, we present PELMS, a pre-trained model that uses objectives based on semantic coherence heuristics and faithfulness constraints with unlabeled multi-document inputs, to promote the generation of concise, fluent, and faithful summaries. To support the training of PELMS, we compile MultiPT, a multi-document pre-training corpus containing over 93 million documents to form more than 3 million unlabeled topic-centric document clusters, covering diverse genres such as product reviews, news, and general knowledge. We perform extensive evaluation of PELMS in low-shot settings on a wide range of MDS datasets. Our approach consistently outperforms competitive comparisons with respect to overall informativeness, abstractiveness, coherence, and faithfulness.

Code Implementations: 1 repo
