CL · May 4, 2020

Exploring Content Selection in Summarization of Novel Chapters

Faisal Ladhak, Bryan Li, Yaser Al-Onaizan, Kathleen McKeown

arXiv:2005.01840v331.31006 citations · Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses a domain-specific summarization challenge for literary analysis, but its contribution is incremental: it builds on existing extractive methods, adding new data and a new alignment metric.

The authors tackle the problem of generating summaries for novel chapters, a harder task than news summarization because of chapter length and extreme paraphrasing. They report significant improvement over prior alignment methods, shown through automatic metrics and a crowd-sourced pyramid analysis.

We present a new summarization task, generating summaries of novel chapters using summary/chapter pairs from online study guides. This is a harder task than the news summarization task, given the chapter length as well as the extreme paraphrasing and generalization found in the summaries. We focus on extractive summarization, which requires the creation of a gold-standard set of extractive summaries. We present a new metric for aligning reference summary sentences with chapter sentences to create gold extracts and also experiment with different alignment methods. Our experiments demonstrate significant improvement over prior alignment approaches for our task as shown through automatic metrics and a crowd-sourced pyramid analysis. We make our data collection scripts available at https://github.com/manestay/novel-chapter-dataset .
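To make the alignment step concrete, here is a minimal sketch of how gold extracts might be built by matching each reference summary sentence to its most similar chapter sentence. It assumes a plain unigram-overlap F1 as the similarity score, which is a stand-in and not the metric proposed in the paper. The function names (`tokenize`, `unigram_f1`, `align`) and the example sentences are hypothetical, chosen only for illustration.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def unigram_f1(reference, candidate):
    """Unigram-overlap F1 between two sentences.

    Assumption: this is a simple stand-in similarity, not the paper's metric.
    """
    ref, cand = Counter(tokenize(reference)), Counter(tokenize(candidate))
    overlap = sum((ref & cand).values())  # clipped count of shared tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def align(summary_sentences, chapter_sentences):
    """For each summary sentence, return the index of the best-scoring chapter sentence."""
    alignment = []
    for ref in summary_sentences:
        scores = [unigram_f1(ref, cand) for cand in chapter_sentences]
        best = max(range(len(scores)), key=scores.__getitem__)
        alignment.append(best)
    return alignment


# Hypothetical toy data, invented for this example.
chapter = [
    "The rain fell all night on the old house.",
    "In the morning, Anna found the letter hidden under the floorboards.",
    "She read it twice before burning it in the stove.",
]
summary = [
    "Anna discovers a hidden letter.",
    "She burns the letter after reading it.",
]

# The gold extract is the set of chapter sentences picked by the alignment.
gold_extract_indices = sorted(set(align(summary, chapter)))
```

Even on this toy input the difficulty described in the abstract shows up: "discovers" and "found" share no tokens, so a surface-overlap score relies on the few words that happen to survive paraphrasing.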

