CL · Nov 9, 2022

Novel Chapter Abstractive Summarization using Spinal Tree Aware Sub-Sentential Content Selection

Hardy Hardy, Miguel Ballesteros, Faisal Ladhak, Muhammad Khalifa, Vittorio Castelli, Kathleen McKeown

Amazon, IBM, Stanford

arXiv:2211.04903v1 · 0.33 citations · h-index: 76

Originality: Incremental advance

AI Analysis

This paper addresses the problem of summarizing lengthy novel chapters for NLP applications, but the advance is incremental because it builds on existing extractive-abstractive approaches.

The authors tackle novel chapter summarization with a pipelined extractive-abstractive method that uses spinal tree information for sub-sentential content selection, achieving a 3.71-point Rouge-1 improvement over the best prior results.

Summarizing novel chapters is a difficult task due to the input length and the fact that sentences that appear in the desired summaries draw content from multiple places throughout the chapter. We present a pipelined extractive-abstractive approach where the extractive step filters the content that is passed to the abstractive component. Extremely lengthy input also results in a highly skewed dataset towards negative instances for extractive summarization; we thus adopt a margin ranking loss for extraction to encourage separation between positive and negative examples. Our extraction component operates at the constituent level; our approach to this problem enriches the text with spinal tree information which provides syntactic context (in the form of constituents) to the extraction model. We show an improvement of 3.71 Rouge-1 points over best results reported in prior work on an existing novel chapter dataset.
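The margin ranking loss mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so this assumes a standard pairwise hinge form, max(0, margin - (s_pos - s_neg)), averaged over all positive/negative pairs. The function name, the margin value, and the example scores are hypothetical and not taken from the paper.

```python
from itertools import product


def margin_ranking_loss(pos_scores, neg_scores, margin=1.0):
    """Mean pairwise hinge loss over all (positive, negative) score pairs.

    Assumed form (not confirmed by the source): each pair contributes
    max(0, margin - (s_pos - s_neg)), so a pair costs nothing once the
    positive constituent outscores the negative one by at least `margin`.
    """
    pairs = list(product(pos_scores, neg_scores))
    if not pairs:
        return 0.0
    total = sum(max(0.0, margin - (p - n)) for p, n in pairs)
    return total / len(pairs)


# Hypothetical extraction scores for constituents in one chapter:
# a few positives (content that belongs in the summary) and more negatives.
pos = [2.0, 0.5]
neg = [0.0, 1.0, -1.0]
loss = margin_ranking_loss(pos, neg, margin=1.0)

# Well-separated scores incur no loss at all.
separated = margin_ranking_loss([3.0], [0.0, 1.0], margin=1.0)
```

Because the loss depends only on score differences within pairs, it pushes positives above negatives by the margin instead of fitting absolute labels, which is one plausible reason such a loss suits data heavily skewed towards negative instances.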
