CL IR Oct 27, 2025

LimRank: Less is More for Reasoning-Intensive Information Reranking

Tingyu Song, Yilun Zhao, Siyue Zhang, Chen Zhao, Arman Cohan

arXiv:2510.23544v1 2 citations h-index: 22 Has Code EMNLP

Originality: Incremental advance

AI Analysis

This work addresses the computational cost of adapting LLMs for reranking, a concern for researchers and practitioners in information retrieval. It is an incremental advance because it builds on existing LLM adaptation methods.

The paper tackles the computational expense of fine-tuning LLMs for information reranking. It proposes LIMRANK, a model adapted with minimal supervision on synthetic data, which achieves competitive performance on benchmarks such as BRIGHT and FollowIR while being trained on less than 5% of the data typically used in prior work.

Existing approaches typically rely on large-scale fine-tuning to adapt LLMs for information reranking tasks, which is computationally expensive. In this work, we demonstrate that modern LLMs can be effectively adapted using only minimal, high-quality supervision. To enable this, we design LIMRANK-SYNTHESIZER, a reusable and open-source pipeline for generating diverse, challenging, and realistic reranking examples. Using this synthetic data, we fine-tune our reranker model, LIMRANK. We evaluate LIMRANK on two challenging benchmarks, i.e., BRIGHT for reasoning-intensive retrieval and FollowIR for instruction-following retrieval. Our experiments demonstrate that LIMRANK achieves competitive performance, while being trained on less than 5% of the data typically used in prior work. Further ablation studies demonstrate the effectiveness of LIMRANK-SYNTHESIZER and the strong generalization capabilities of LIMRANK across downstream tasks, including scientific literature search and retrieval-augmented generation for knowledge-intensive problem solving.

