CL · Dec 28, 2020

On Generating Extended Summaries of Long Documents

arXiv:2012.14136v1 · 19 citations · Has Code
AI Analysis

This work is significant for researchers and practitioners who need more detailed summaries of long documents like research papers or legal texts, providing an incremental improvement in long-form summary generation.

This paper addresses the generation of extended summaries for long documents, which often contain more detailed information than short summaries can accommodate. The authors propose a method that exploits the hierarchical structure of documents and integrates it into an extractive summarization model through multi-task learning. The resulting model outperforms or matches strong baselines on three long-document summarization datasets.

Prior work in document summarization has mainly focused on generating short summaries of a document. While this type of summary helps get a high-level view of a given document, it is desirable in some cases to know more detailed information about its salient points that can't fit in a short summary. This is typically the case for longer documents such as research papers, legal documents, or books. In this paper, we present a new method for generating extended summaries of long papers. Our method exploits the hierarchical structure of the documents and incorporates it into an extractive summarization model through a multi-task learning approach. We then present our results on three long summarization datasets: arXiv-Long, PubMed-Long, and Longsumm. Our method outperforms or matches the performance of strong baselines. Furthermore, we perform a comprehensive analysis of the generated results, offering insights for future research on the long-form summary generation task. Our analysis shows that our multi-tasking approach can adjust the extraction probability distribution in favor of summary-worthy sentences across diverse sections. Our datasets and code are publicly available at https://github.com/Georgetown-IR-Lab/ExtendedSumm
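To make the idea of a multi-task extractive objective concrete, here is a minimal sketch in plain Python. The abstract does not specify the auxiliary task, the loss functions, or any weighting, so everything below is an assumption for illustration: the auxiliary task is taken to be predicting which section each sentence belongs to, and `multitask_loss`, `aux_weight`, and the input shapes are hypothetical names, not taken from the ExtendedSumm code.

```python
import math

EPS = 1e-12  # guards against log(0)


def binary_cross_entropy(p, y):
    """Loss for one sentence: p is the predicted extraction probability, y is 0 or 1."""
    return -(y * math.log(p + EPS) + (1 - y) * math.log(1 - p + EPS))


def softmax(logits):
    """Numerically stable softmax over a list of section logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def multitask_loss(extract_probs, extract_labels,
                   section_logits, section_labels, aux_weight=0.5):
    """Combine the main and auxiliary losses, averaged over sentences.

    Main task: should each sentence be extracted into the summary?
    Auxiliary task (assumed here): which section does each sentence belong to?
    aux_weight is a hypothetical hyperparameter balancing the two terms.
    """
    n = len(extract_probs)
    extraction = sum(
        binary_cross_entropy(p, y)
        for p, y in zip(extract_probs, extract_labels)
    ) / n
    section = sum(
        -math.log(softmax(logits)[label] + EPS)
        for logits, label in zip(section_logits, section_labels)
    ) / n
    return extraction + aux_weight * section


# Two sentences, two candidate sections (toy values).
loss = multitask_loss(
    extract_probs=[0.9, 0.1],
    extract_labels=[1, 0],
    section_logits=[[2.0, 0.0], [0.0, 2.0]],
    section_labels=[0, 1],
)
print(loss)
```

In a sketch like this, the auxiliary term only shapes the shared sentence representation during training. At inference time, sentences would be ranked by their extraction probabilities alone.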

Code Implementations: 1 repo