EntSUM: A Data Set for Entity-Centric Summarization
This work addresses the need for more tailored summaries in information retrieval, but its contribution is incremental: it pairs a new data set with extensions of existing summarization methods.
The authors tackle entity-centric summarization by introducing EntSUM, a human-annotated data set for controllable summarization. They show that existing controllable summarization methods fail on this task and that their proposed extensions achieve substantially better results.
Controllable summarization aims to provide summaries that take into account user-specified aspects and preferences to better assist users with their information needs, as opposed to the standard summarization setup, which builds a single generic summary of a document. We introduce EntSUM, a human-annotated data set for controllable summarization with a focus on named entities as the aspects to control. We conduct an extensive quantitative analysis to motivate the task of entity-centric summarization and show that existing methods for controllable summarization fail to generate entity-centric summaries. We propose extensions to state-of-the-art summarization approaches that achieve substantially better results on our data set. Our analysis and results show the challenging nature of this task and of the proposed data set.
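To make the task setup concrete, the sketch below shows the input and output shape of entity-centric summarization: a document plus a target entity goes in, and a summary focused on that entity comes out. It is a naive illustration only, not the method or any baseline from the paper. The function names, the string-matching heuristic, and the example document are all assumptions introduced here.

```python
import re


def split_sentences(text):
    # Naive splitter (assumption): break on whitespace that follows ., !, or ?
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def entity_centric_summary(document, entity, max_sentences=3):
    """Return the first `max_sentences` sentences that mention `entity`.

    Hypothetical helper: matches the entity by its exact surface form,
    so it misses pronouns, aliases, and other coreferent mentions.
    """
    pattern = re.compile(r"\b" + re.escape(entity) + r"\b", re.IGNORECASE)
    selected = [s for s in split_sentences(document) if pattern.search(s)]
    return " ".join(selected[:max_sentences])


# Invented example document; the same text yields a different summary per entity.
document = (
    "Alice Chen founded the startup in 2015. "
    "The company grew quickly. "
    "Bob Ruiz joined as chief engineer two years later. "
    "Alice Chen stepped down in 2020."
)

summary_alice = entity_centric_summary(document, "Alice Chen")
summary_bob = entity_centric_summary(document, "Bob Ruiz")
print(summary_alice)
print(summary_bob)
```

The same document produces a different summary for each entity, whereas a generic summarizer would return one summary regardless of the entity of interest. Exact string matching is a deliberately simple choice here; it does not resolve coreference or judge how salient a sentence is for the entity.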