CL Sep 14, 2023

Less is More for Long Document Summary Evaluation by LLMs

NVIDIA
arXiv:2309.07382v2
118 citations
h-index: 20
Originality: Incremental advance
AI Analysis

This work addresses cost and accuracy issues for researchers and practitioners who use LLMs to evaluate summaries of long documents. It represents an incremental improvement.

The paper tackles the high computational costs and Lost-in-the-Middle problem in LLM-based summary evaluation for long documents by proposing an Extract-then-Evaluate method, which reduces costs and achieves a higher correlation with human evaluations.

Large Language Models (LLMs) have shown promising performance in summary evaluation tasks, yet they face challenges such as high computational costs and the Lost-in-the-Middle problem where important information in the middle of long documents is often overlooked. To address these issues, this paper introduces a novel approach, Extract-then-Evaluate, which involves extracting key sentences from a long source document and then evaluating the summary by prompting LLMs. The results reveal that the proposed method not only significantly reduces evaluation costs but also exhibits a higher correlation with human evaluations. Furthermore, we provide practical recommendations for optimal document length and sentence extraction methods, contributing to the development of cost-effective yet more accurate methods for LLM-based text generation evaluation.
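The two-step idea in the abstract (extract key sentences, then prompt an LLM with the shortened source) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the unigram-overlap score is a simple stand-in for whatever extraction method the paper recommends, the prompt wording and the 1 to 5 scale are invented for the example, and `llm` is a hypothetical callable that takes a prompt string and returns the model's reply.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    # Naive splitter: break after ., ! or ? when followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_key_sentences(document: str, summary: str, max_words: int) -> str:
    """Keep the sentences that overlap most with the summary, within a
    word budget, and return them in their original document order."""
    summary_vocab = set(tokenize(summary))
    scored = []
    for idx, sent in enumerate(split_sentences(document)):
        tokens = set(tokenize(sent))
        # Assumed scoring: share of the sentence's distinct tokens that
        # also appear in the summary (a stand-in, not the paper's method).
        score = len(tokens & summary_vocab) / len(tokens) if tokens else 0.0
        scored.append((score, idx, sent))
    # Highest score first; ties go to the earlier sentence.
    scored.sort(key=lambda item: (-item[0], item[1]))

    chosen, used = [], 0
    for _, idx, sent in scored:
        n_words = len(sent.split())
        if used + n_words <= max_words:
            chosen.append((idx, sent))
            used += n_words
    chosen.sort()  # restore document order
    return " ".join(sent for _, sent in chosen)


def build_prompt(extracted: str, summary: str) -> str:
    # Illustrative prompt wording; the paper's actual prompt may differ.
    return (
        "Source (extracted sentences):\n" + extracted + "\n\n"
        "Summary:\n" + summary + "\n\n"
        "Rate the summary's consistency with the source from 1 to 5. "
        "Answer with a single integer."
    )


def extract_then_evaluate(
    document: str,
    summary: str,
    llm: Callable[[str], str],  # hypothetical: prompt in, reply text out
    max_words: int = 1000,
) -> int:
    extracted = extract_key_sentences(document, summary, max_words)
    reply = llm(build_prompt(extracted, summary))
    match = re.search(r"[1-5]", reply)
    if match is None:
        raise ValueError("no score between 1 and 5 found in the LLM reply")
    return int(match.group())
```

Because the LLM only sees the extracted sentences, the prompt length is bounded by `max_words` rather than by the length of the source document, which is where the cost saving described above would come from.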

Code Implementations: 1 repo