Towards Supervised Extractive Text Summarization via RNN-based Sequence Classification
This work addresses data scarcity in extractive summarization, but the contribution is incremental: it combines an existing RNN-based sentence classification approach with a technique for generating training data.
The authors tackle extractive text summarization by training an RNN to classify each sentence of an article as belonging to the summary or not. To address data scarcity, they generate extractive summaries from the abstractive ones available in the CNN corpus.
This article briefly describes the approach we submitted to the DocEng'19 competition on extractive summarization. We implemented a model based on a recurrent neural network that learns to classify whether a sentence of an article belongs to the corresponding extractive summary. We work around the lack of large annotated news corpora for extractive summarization by generating extractive summaries from abstractive ones, which are available in the CNN corpus.
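To make the idea of deriving extractive labels from abstractive summaries concrete, the following is a minimal sketch and not the procedure used in the submission, which this abstract does not specify. It assumes a simple heuristic: for each abstractive highlight, the article sentence with the highest Jaccard word overlap is labeled 1, and all other sentences are labeled 0. The function names (`tokenize`, `jaccard`, `label_sentences`) and the toy article are hypothetical and serve only as illustration.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def label_sentences(article_sentences, highlights):
    """Return one 0/1 label per article sentence.

    Assumed heuristic (not taken from the paper): for each abstractive
    highlight, the article sentence with the highest Jaccard overlap is
    marked 1, meaning it belongs to the generated extractive summary.
    """
    labels = [0] * len(article_sentences)
    sentence_tokens = [tokenize(s) for s in article_sentences]
    for highlight in highlights:
        h = tokenize(highlight)
        scores = [jaccard(h, t) for t in sentence_tokens]
        # Skip highlights that share no words with any sentence.
        if scores and max(scores) > 0.0:
            labels[scores.index(max(scores))] = 1
    return labels


# Toy example: three article sentences and two abstractive highlights.
article = [
    "The city council approved a new budget on Monday.",
    "Residents gathered outside the hall before the vote.",
    "The budget increases funding for public transport by ten percent.",
]
highlights = [
    "Council approves new budget.",
    "Public transport funding rises ten percent.",
]
labels = label_sentences(article, highlights)
print(labels)  # [1, 0, 1]
```

Labels produced this way could serve as the binary targets for a sentence-level classifier such as the RNN described above. In practice, other similarity measures (for example ROUGE-based ones) might be substituted for the Jaccard overlap assumed here.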