IR · Jun 2

EviRerank: Adaptive Evidence Construction for Long-Document LLM Reranking

arXiv:2411.06254 · 62.21 citations
Predicted impact: top 51% in IR · last 90 days
Originality: Incremental advance
AI Analysis

For practitioners using decoder-only LLMs for long-document retrieval, EviRerank offers a practical method to improve efficiency and effectiveness, though it is an incremental improvement over existing block-selection approaches.

EviRerank addresses the challenge of long-document reranking with decoder-only LLMs by constructing a compact evidence context via adaptive block selection and summary augmentation. On TREC DL'19, it achieves 0.744 nDCG@10 and 0.307 MAP, outperforming full-document LLM reranking while reducing input length.

Decoder-only LLM rerankers struggle with long documents: inference is costly and relevance signals can be diluted by irrelevant context. Motivated by a diagnostic attention analysis suggesting that appended irrelevant context can weaken query-focused interactions, we propose EviRerank, an evidence-based long-document reranking framework for decoder-only LLMs. EviRerank first scores document blocks with a lightweight selector, such as BM25, a bi-encoder, or a cross-encoder. It then constructs a compact reranking context under a hard token cap by dynamically budgeting evidence blocks with Adaptive Evidence Budgeting (AEB) and adding a compact global cue via Summary Augmentation (SA). Finally, the compact evidence context is reranked with a decoder-only LLM. Across TREC DL'19, DL'22, DL'23, and MLDR-zh, EviRerank consistently outperforms full-document LLM reranking and strong block-selection baselines while reducing input length. RankZephyr-7B validation further confirms transfer to listwise reranking. On TREC DL'19, EviRerank reaches up to 0.744 nDCG@10 and 0.307 MAP, improving over RankLLaMA while using a compact evidence context.
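The select-then-assemble step described above can be illustrated with a minimal sketch. The abstract does not specify how AEB allocates its budget or how summaries are produced, so everything here is an assumption: a term-overlap scorer stands in for the lightweight selector (BM25, bi-encoder, or cross-encoder), a plain greedy fill stands in for Adaptive Evidence Budgeting, the summary is supplied by the caller, and tokens are counted with a simple word tokenizer rather than an LLM tokenizer. The names `tokenize`, `overlap_score`, and `build_evidence_context` are hypothetical and are not taken from the paper.

```python
import re
from typing import Callable, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase word tokenizer; a stand-in for a real LLM tokenizer."""
    return re.findall(r"\w+", text.lower())


def overlap_score(query: str, block: str) -> float:
    """Hypothetical selector: fraction of query terms that appear in the block."""
    q_terms = set(tokenize(query))
    if not q_terms:
        return 0.0
    return len(q_terms & set(tokenize(block))) / len(q_terms)


def build_evidence_context(
    query: str,
    blocks: List[str],
    summary: str,
    token_cap: int,
    scorer: Callable[[str, str], float] = overlap_score,
) -> Tuple[str, List[int]]:
    """Assemble a compact context of at most `token_cap` word tokens.

    Assumption: a greedy fill by selector score, not the paper's AEB.
    """
    # Reserve room for the global cue first; omit it if it cannot fit.
    summary_cost = len(tokenize(summary))
    if summary_cost > token_cap:
        summary, summary_cost = "", 0
    remaining = token_cap - summary_cost

    # Rank blocks by score (ties broken by document position).
    scored = [(scorer(query, b), i) for i, b in enumerate(blocks)]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))

    chosen: List[int] = []
    for score, idx in ranked:
        if score <= 0.0:
            break  # remaining blocks carry no query signal
        cost = len(tokenize(blocks[idx]))
        if cost <= remaining:
            chosen.append(idx)
            remaining -= cost

    # Restore document order so the evidence reads coherently.
    chosen.sort()
    parts = ([summary] if summary else []) + [blocks[i] for i in chosen]
    return "\n".join(parts), chosen
```

The returned string would then be passed, together with the query, to the decoder-only reranker. Placing the summary first and keeping the selected blocks in document order is a design choice of this sketch, not something the abstract states.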
