CL Jan 20, 2023

Document Summarization with Text Segmentation

arXiv:2301.08817v11.74 citations h-index: 8

Originality: Synthesis-oriented

AI Analysis

This work addresses the lead bias problem in extractive summarization for scientific articles, but it is incremental as it builds on existing segmentation and summarization methods.

The paper tackles extractive summarization by using document segmentation to reduce lead bias. It shows that a highly accurate segmentation method improves performance, particularly when the relevant information is not at the beginning of a document.

In this paper, we exploit the innate segment structure of documents to improve extractive summarization. We build two text segmentation models and identify the best strategy for introducing their output predictions into an extractive summarization model. Experimental results on a corpus of scientific articles show that extractive summarization benefits from a highly accurate segmentation method. In particular, most of the improvement occurs in documents where the most relevant information is not at the beginning; thus, we conclude that segmentation helps reduce the lead bias problem.
