Aspect-Based Summarization with Self-Aspect Retrieval Enhanced Generation
This work addresses the resource constraints and limited generalizability of conventional summarization for users who need tailored summaries, though the contribution is incremental because it builds on existing methods.
The paper tackles token limits and hallucination in aspect-based summarization with large language models. It proposes a framework that uses embedding-driven retrieval to identify the text segments relevant to a given aspect, and it reports superior performance on benchmark datasets.
Aspect-based summarization aims to generate summaries tailored to specific aspects, addressing the resource constraints and limited generalizability of traditional summarization approaches. Recently, large language models have shown promise in this task without the need for training. However, they rely excessively on prompt engineering and face token limits and hallucination, especially with in-context learning. To address these challenges, we propose a novel framework for aspect-based summarization: Self-Aspect Retrieval Enhanced Summary Generation. Rather than relying solely on in-context learning, we employ an embedding-driven retrieval mechanism that, given an aspect, identifies the text segments relevant to it. This approach extracts the pertinent content while avoiding unnecessary details, thereby mitigating the challenge of token limits. Moreover, our framework optimizes token usage by removing unrelated parts of the text and ensuring that the model generates output strictly based on the given aspect. With extensive experiments on benchmark datasets, we demonstrate that our framework not only achieves superior performance but also effectively mitigates the token limitation problem.
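To make the retrieval step described above concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy bag-of-words vector in place of a learned embedding model, whitespace word counts in place of a real tokenizer, and hypothetical helper names (`embed`, `retrieve_segments`, `build_prompt`). It illustrates the general idea only: score each segment against the aspect, keep the most similar ones within a token budget, and build the prompt from the retained text.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_segments(segments, aspect, token_budget):
    """Keep the segments most similar to the aspect, within a word budget.

    Segments with zero similarity are dropped; the kept segments are
    returned in their original document order.
    """
    query = embed(aspect)
    scored = [(cosine(embed(s), query), i) for i, s in enumerate(segments)]
    # Highest similarity first; ties broken by document position.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    chosen, used = [], 0
    for score, i in scored:
        cost = len(segments[i].split())  # assumption: one word ~ one token
        if score > 0 and used + cost <= token_budget:
            chosen.append(i)
            used += cost
    return [segments[i] for i in sorted(chosen)]


def build_prompt(segments, aspect, token_budget=50):
    """Assemble a summarization prompt from the retrieved segments only."""
    kept = retrieve_segments(segments, aspect, token_budget)
    context = "\n".join(kept)
    return f"Summarize the following text with respect to the aspect '{aspect}':\n{context}"
```

For example, given hotel-review segments about the room, the food, and parking, calling `build_prompt(segments, "food")` would produce a prompt containing only the food-related segment. A real system would replace `embed` with a sentence-embedding model and count tokens with the target model's tokenizer.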