Sentence Embeddings as an intermediate target in end-to-end summarisation
This work addresses the difficulty of summarising large inputs, specifically in the domain of user reviews, with incremental improvements over prior methods.
The paper tackles the challenge of content selection in end-to-end summarisation of large user-review inputs by combining extractive methods with pre-trained sentence embeddings, showing that this approach outperforms existing methods and improves summary quality for loosely aligned corpora.
Current neural approaches to document summarisation struggle when applied to datasets containing large inputs. In this paper we propose a new approach to content selection for end-to-end summarisation of user reviews of accommodations. We show that combining an extractive approach with externally pre-trained sentence-level embeddings, in addition to an abstractive summarisation model, outperforms existing methods on the task of summarising a large input dataset. We also show that predicting the sentence-level embedding of a summary improves the quality of an end-to-end system for loosely aligned source-to-target corpora, compared with the more common approach of predicting probability distributions over sentence selection.
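The idea of using a predicted summary embedding as an extraction target can be illustrated with a minimal sketch. It is not the system described in the paper: it assumes that the predicted embedding is already available, that content is selected by cosine similarity to it, and that a fixed number k of sentences is kept. The function names, the toy three-dimensional vectors, and the choice of cosine similarity are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_sentences(sentence_embeddings, predicted_embedding, k=2):
    """Return the indices of the k source sentences closest to the predicted
    summary embedding, in source order (hypothetical selection rule)."""
    scored = [
        (cosine(emb, predicted_embedding), idx)
        for idx, emb in enumerate(sentence_embeddings)
    ]
    # Highest similarity first; ties are broken by the earlier sentence.
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:k]
    return sorted(idx for _, idx in top)


# Toy stand-ins for externally pre-trained sentence embeddings (assumed values).
sentence_embeddings = [
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.8, 0.2, 0.1],
    [0.0, 0.0, 1.0],
]
# Toy stand-in for the embedding a model would predict for the summary.
predicted_embedding = [1.0, 0.0, 0.0]

selected = select_sentences(sentence_embeddings, predicted_embedding, k=2)
print(selected)
```

In this sketch the selected sentences would then be passed to an abstractive model. The contrast with the more common setup is that the selector regresses a single vector rather than outputting a selection probability for each source sentence.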