Neural Summarization by Extracting Sentences and Words
This paper addresses automatic text summarization for NLP applications and presents an incremental improvement over traditional feature-based methods.
The authors tackle single-document extractive summarization with a neural framework that combines a hierarchical document encoder and an attention-based extractor, achieving results comparable to the state of the art on two datasets without any linguistic annotation.
Traditional approaches to extractive summarization rely heavily on human-engineered features. In this work we propose a data-driven approach based on neural networks and continuous sentence features. We develop a general framework for single-document summarization composed of a hierarchical document encoder and an attention-based extractor. This architecture allows us to develop different classes of summarization models which can extract sentences or words. We train our models on large-scale corpora containing hundreds of thousands of document-summary pairs. Experimental results on two summarization datasets demonstrate that our models obtain results comparable to the state of the art without any access to linguistic annotation.
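
To make the encoder/extractor split concrete, the toy sketch below shows one way the two stages could fit together for sentence extraction. It is not the authors' model: the abstract does not specify the layers, so mean pooling stands in for the hierarchical encoder and a softmax over dot products stands in for the attention-based extractor. The embeddings are random and untrained, and all names (`mean_pool`, `encode_document`, `attention_scores`, `extract`) are hypothetical.

```python
import math
import random


def mean_pool(vectors):
    """Average a list of equal-length vectors into a single vector."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def encode_document(doc, embed):
    """Hierarchical encoding: words -> sentence vectors -> document vector.

    Assumption: mean pooling at both levels, purely for illustration.
    """
    sent_vecs = [mean_pool([embed[w] for w in sent]) for sent in doc]
    doc_vec = mean_pool(sent_vecs)
    return sent_vecs, doc_vec


def attention_scores(sent_vecs, doc_vec):
    """Softmax over dot products between each sentence and the document."""
    logits = [sum(s * d for s, d in zip(sv, doc_vec)) for sv in sent_vecs]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def extract(doc, embed, k=1):
    """Return the indices of the k highest-scoring sentences, in document order."""
    sent_vecs, doc_vec = encode_document(doc, embed)
    scores = attention_scores(sent_vecs, doc_vec)
    top = sorted(range(len(doc)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top), scores


# A tiny tokenized document (made-up example data).
doc = [
    ["neural", "models", "summarize", "documents"],
    ["the", "weather", "was", "mild"],
    ["extractive", "models", "select", "sentences"],
]

# Random, untrained embeddings; a real system would learn these from data.
rng = random.Random(0)
vocab = sorted({w for sent in doc for w in sent})
embed = {w: [rng.uniform(-1.0, 1.0) for _ in range(8)] for w in vocab}

selected, scores = extract(doc, embed, k=2)
print("selected sentence indices:", selected)
print("attention weights:", [round(s, 3) for s in scores])
```

Because the embeddings are untrained, which sentences get selected here is arbitrary; the sketch only illustrates the data flow from words to sentence vectors to per-sentence scores. A word-level extractor would follow the same pattern with scores computed over individual words rather than sentences.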