CL Jul 10, 2025

An Automated Length-Aware Quality Metric for Summarization

arXiv:2507.07653v1, 1 citation, h-index: 1

Originality: Incremental advance

AI Analysis

NOIR provides an automated tool for assessing summarization quality across various tasks, reducing reliance on human-written reference summaries.

The paper tackles the problem of evaluating summarization quality by proposing NOIR, a metric that balances semantic retention and length compression, and shows it correlates with human judgments.

This paper proposes NOrmed Index of Retention (NOIR), a quantitative objective metric for evaluating summarization quality of arbitrary texts that relies on both the retention of semantic meaning and the summary length compression. This gives a measure of how well the recall-compression tradeoff is managed, the most important skill in summarization. Experiments demonstrate that NOIR effectively captures the tradeoff between token length and semantic retention of a summarizer and correlates with human perception of summarization quality. Using a language model embedding to measure semantic similarity, it provides an automated alternative for assessing summarization quality without relying on time-consuming human-generated reference summaries. The proposed metric can be applied to various summarization tasks, offering an automated tool for evaluating and improving summarization algorithms, summarization prompts, and synthetically generated summaries.
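
The abstract names the two ingredients of the metric (embedding-based semantic retention and length compression) but does not state how NOIR combines them. The sketch below therefore only computes the two quantities separately and should not be read as the paper's formula. The function names are hypothetical, and `toy_embed` is a bag-of-words stand-in assumed here so the example runs without a model; the paper uses a language model embedding instead.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in embedding: bag-of-words counts (an assumption, not a language model)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retention_and_compression(source: str, summary: str) -> tuple[float, float]:
    """Return (semantic retention, length ratio) for a source/summary pair.

    Retention is the cosine similarity of the two embeddings; the length
    ratio is summary words divided by source words (lower means shorter).
    How NOIR normalizes and combines these is not given in the abstract.
    """
    retention = cosine_similarity(toy_embed(source), toy_embed(summary))
    source_len = len(source.split())
    summary_len = len(summary.split())
    length_ratio = summary_len / source_len if source_len else 0.0
    return retention, length_ratio
```

A good summary under this view keeps retention high while driving the length ratio down; swapping `toy_embed` for a real sentence-embedding model would bring the retention term closer to what the abstract describes.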
