CL | Nov 9, 2020

Automatic Summarization of Open-Domain Podcast Episodes

Kaiqiang Song, Chen Li, Xiaoyang Wang, Dong Yu, Fei Liu

arXiv:2011.04132v21.011 citations

Originality: Incremental advance

AI Analysis

This work addresses the need for concise summaries to help users decide whether to listen to podcasts, but it is incremental as it builds on existing neural methods with specific optimizations.

The authors tackle the summarization of open-domain podcast episodes by investigating less-studied aspects of neural abstractive summarization, such as selecting important transcript segments to serve as input to the summarizer. Their best system achieved a quality rating of 1.559 from NIST evaluators, an absolute increase of 0.268 (+21%) over the creator descriptions.

We present implementation details of our abstractive summarizers that achieve competitive results on the Podcast Summarization task of TREC 2020. A concise textual summary that captures important information is crucial for users to decide whether to listen to the podcast. Prior work focuses primarily on learning contextualized representations. Instead, we investigate several less-studied aspects of neural abstractive summarization, including (i) the importance of selecting important segments from transcripts to serve as input to the summarizer; (ii) striking a balance between the amount and quality of training instances; (iii) the appropriate summary length and start/end points. We highlight the design considerations behind our system and offer key insights into the strengths and weaknesses of neural abstractive systems. Our results suggest that identifying important segments from transcripts to use as input to an abstractive summarizer is advantageous for summarizing long documents. Our best system achieves a quality rating of 1.559 judged by NIST evaluators, an absolute increase of 0.268 (+21%) over the creator descriptions.
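The reported figures can be cross-checked with a little arithmetic. The sketch below uses only the two numbers stated in the abstract (1.559 and 0.268). The creator-description rating it derives is an inference from those numbers, not a value quoted in the source, and the variable names are illustrative.

```python
# Figures reported in the abstract.
best_rating = 1.559      # best system, as judged by NIST evaluators
absolute_gain = 0.268    # absolute increase over the creator descriptions

# Implied rating of the creator descriptions (derived here, not stated in the source).
creator_rating = best_rating - absolute_gain

# Relative improvement over the creator descriptions, in percent.
relative_gain_pct = 100 * absolute_gain / creator_rating

print(f"implied creator-description rating: {creator_rating:.3f}")
print(f"relative improvement: {relative_gain_pct:.1f}%")
```

Under this reading, the relative improvement rounds to the +21% quoted in the abstract.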

