CL · May 6, 2025

Sentence Embeddings as an intermediate target in end-to-end summarisation

arXiv:2505.03481v1 · 1 citation · h-index: 12
Originality: Incremental advance
AI Analysis

This work addresses summarization difficulties for large datasets, specifically in the domain of user reviews, with incremental improvements over prior methods.

The paper tackles the challenge of content selection in end-to-end summarization of large user reviews by combining extractive methods with pre-trained sentence embeddings, showing that this approach outperforms existing methods and improves summary quality for loosely aligned corpora.

Current neural network-based approaches to document summarisation struggle when applied to datasets containing large inputs. In this paper we propose a new approach to the challenge of content selection in end-to-end summarisation of user reviews of accommodations. We show that combining an extractive approach, based on externally pre-trained sentence-level embeddings, with an abstractive summarisation model outperforms existing methods on the task of summarising a large input dataset. We also prove that, for loosely aligned source-to-target corpora, predicting the sentence-level embedding of a summary increases the quality of an end-to-end system compared with the common practice of predicting probability distributions over sentence selection.
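The content-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it only assumes that a model has already produced a predicted summary embedding and that each source sentence has an embedding from some pre-trained encoder. The function names (`cosine`, `select_by_embedding`), the parameter `k`, and the toy three-dimensional vectors are all hypothetical and chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_by_embedding(sentence_vecs, predicted_vec, k=2):
    """Return the indices of the k source sentences closest to the predicted
    summary embedding, in their original document order."""
    scored = [(cosine(vec, predicted_vec), i) for i, vec in enumerate(sentence_vecs)]
    top = sorted(scored, reverse=True)[:k]
    return sorted(i for _, i in top)


# Toy embeddings (made up): a real system would obtain these from a
# pre-trained sentence encoder rather than hand-written lists.
sentence_vecs = [
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.8, 0.2, 0.1],
    [0.0, 0.0, 1.0],
]
# Hypothetical output of a model that predicts the summary's embedding.
predicted_vec = [1.0, 0.1, 0.0]

selected = select_by_embedding(sentence_vecs, predicted_vec, k=2)
print(selected)
```

In this sketch the selected sentences would then be passed on to an abstractive model. The contrast drawn in the abstract is with systems that instead predict a selection probability for each sentence directly.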
