CL · Apr 16, 2021

Unsupervised Extractive Summarization by Human Memory Simulation

Ronald Cardenas, Matthias Galle, Shay B. Cohen

arXiv:2104.08392 · Has Code

Originality: Incremental advance

AI Analysis

The paper addresses the problem of identifying important information in document summarization for NLP applications. The advance appears incremental because it builds on existing cognitive approaches.

The paper tackles content selection in unsupervised extractive summarization of long documents by introducing heuristics based on a simulation of human memory. Automatic and human evaluations indicate that these heuristics effectively identify summary-worthy content in scientific articles.

Summarization systems face the core challenge of identifying and selecting important information. In this paper, we tackle the problem of content selection in unsupervised extractive summarization of long, structured documents. We introduce a wide range of heuristics that leverage cognitive representations of content units and how these are retained or forgotten in human memory. We find that properties of these representations of human memory can be exploited to capture relevance of content units in scientific articles. Experiments show that our proposed heuristics are effective at leveraging cognitive structures and the organization of the document (i.e., sections of an article), and automatic and human evaluations provide strong evidence that these heuristics extract more summary-worthy content units.
