CL
Dec 25, 2019

Leveraging Lead Bias for Zero-shot Abstractive News Summarization

arXiv:1912.11602v4
20 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of generating high-quality news summaries without labeled data. The advance is incremental: it adapts existing models such as BART and T5 with a new pre-training strategy rather than introducing a new architecture.

The paper tackles zero-shot abstractive news summarization by exploiting the lead bias of news articles for self-supervised pre-training. It reports state-of-the-art zero-shot results without any fine-tuning; for example, the ROUGE-1 score of BART on DUC2003 increases by 13.7% after lead-bias pre-training.

A typical journalistic convention in news articles is to deliver the most salient information in the beginning, also known as the lead bias. While this phenomenon can be exploited in generating a summary, it has a detrimental effect on teaching a model to discriminate and extract important information in general. We propose that this lead bias can be leveraged in our favor in a simple and effective way to pre-train abstractive news summarization models on large-scale unlabeled news corpora: predicting the leading sentences using the rest of an article. We collect a massive news corpus and conduct data cleaning and filtering via statistical analysis. We then apply self-supervised pre-training on this dataset to existing generation models BART and T5 for domain adaptation. Via extensive experiments on six benchmark datasets, we show that this approach can dramatically improve the summarization quality and achieve state-of-the-art results for zero-shot news summarization without any fine-tuning. For example, in the DUC2003 dataset, the ROUGE-1 score of BART increases 13.7% after the lead-bias pre-training. We deploy the model in Microsoft News and provide public APIs as well as a demo website for multi-lingual news summarization.
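The pre-training objective described above (predicting the leading sentences from the rest of an article) can be illustrated with a minimal sketch of how one training pair might be built. This is not the authors' code: the function names, the naive regex sentence splitter, and the `num_lead` parameter are assumptions made for illustration, and the corpus cleaning and filtering steps mentioned in the abstract are not reproduced here.

```python
import re


def split_sentences(text):
    """Naive sentence splitter (an assumption for this sketch):
    break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def make_lead_bias_pair(article, num_lead=3):
    """Build one self-supervised (source, target) pair.

    target: the first `num_lead` sentences of the article (the "lead").
    source: the remaining sentences, which the model reads as input.
    `num_lead` is a hypothetical parameter; the abstract only says
    "leading sentences" without fixing a number.
    Returns None when the article is too short to leave any source text.
    """
    sentences = split_sentences(article)
    if len(sentences) <= num_lead:
        return None
    target = " ".join(sentences[:num_lead])
    source = " ".join(sentences[num_lead:])
    return source, target


if __name__ == "__main__":
    article = (
        "The city council approved a new transit plan on Monday. "
        "The plan adds two bus lines. "
        "Officials said construction starts next year. "
        "Residents raised concerns about noise. "
        "A public hearing is scheduled for March."
    )
    pair = make_lead_bias_pair(article, num_lead=2)
    if pair is not None:
        source, target = pair
        print("SOURCE:", source)
        print("TARGET:", target)
```

Pairs built this way could then be fed to a sequence-to-sequence model such as BART or T5, with `source` as the encoder input and `target` as the text to generate. A production pipeline would likely use a proper sentence tokenizer in place of the regex.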

