CL | Aug 8, 2025

Beyond Perplexity: Let the Reader Select Retrieval Summaries via Spectrum Projection Score

arXiv:2508.05909v2 | h-index: 11 | Has Code
Originality: Incremental advance
AI Analysis

This work gives researchers and practitioners a more principled way to evaluate and optimize the retrieval stage of RAG systems, with the aim of improving generation quality.

The paper tackles the challenge of isolating the contribution of retrieval in retrieval-augmented generation. It introduces Spectrum Projection Score (SPS), a metric that measures semantic alignment between retrieved summaries and hidden representations, and xCompress, a framework for dynamic summary selection. Experiments on five QA benchmarks with four LLMs show improved performance.

Large Language Models (LLMs) have shown improved generation performance through retrieval-augmented generation (RAG) following the retriever-reader paradigm, which supplements model inputs with externally retrieved knowledge. However, prior work often evaluates RAG holistically, assessing the retriever and reader jointly, which makes it difficult to isolate the true contribution of retrieval, particularly given the prompt sensitivity of LLMs used as readers. We move beyond perplexity and introduce Spectrum Projection Score (SPS), a lightweight, supervision-free metric that lets the reader gauge the semantic alignment of a retrieved summary with its hidden representation, and thereby measure relevance, by comparing the area formed by the tokens generated from the summary with the principal directions of the reader's subspace. Building on SPS, we present xCompress, an inference-time controller framework that dynamically samples, ranks, and compresses retrieval summary candidates. Extensive experiments on five QA benchmarks with four open-source LLMs show that SPS not only enhances performance across a range of tasks but also provides a principled perspective on the interaction between retrieval and generation.
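To make the select-by-alignment idea in the abstract concrete, here is a minimal sketch. It is not the paper's SPS formula or the xCompress implementation, neither of which is specified on this page. The function names `projection_score` and `select_summary`, the toy vectors, and the choice of "fraction of squared norm captured by a set of orthonormal directions" as the score are all assumptions made purely for illustration.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def projection_score(token_vectors: Sequence[Vector],
                     directions: Sequence[Vector]) -> float:
    """Toy alignment score (an assumed stand-in, NOT the paper's SPS).

    Returns the fraction of the token vectors' total squared norm that lies
    in the subspace spanned by `directions`, which must be orthonormal.
    """
    total = sum(dot(v, v) for v in token_vectors)
    if total == 0.0:
        return 0.0
    captured = sum(dot(v, d) ** 2 for v in token_vectors for d in directions)
    return captured / total


def select_summary(candidates: Sequence[str],
                   score: Callable[[str], float]) -> str:
    """Rank candidate summaries by score and return the highest-scoring one."""
    if not candidates:
        raise ValueError("need at least one candidate summary")
    return max(candidates, key=score)


# Hypothetical data: two candidate summaries with made-up 3-d token vectors,
# and two made-up orthonormal "principal directions" of the reader.
directions = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
token_vectors = {
    "summary_a": [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)],  # lies inside the subspace
    "summary_b": [(0.0, 0.0, 3.0), (1.0, 0.0, 0.0)],  # mostly outside it
}

best = select_summary(
    list(token_vectors),
    lambda name: projection_score(token_vectors[name], directions),
)
print(best)  # summary_a
```

In a real system the token vectors would come from the reader model's hidden states and the directions from a spectral decomposition of those states; both are replaced by hand-written toy values here.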

