IR · AI · CL · LG · Aug 9, 2025

ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability

arXiv:2508.07050v2 · 38 citations · h-index: 27 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the bottleneck of training data scarcity for reasoning-intensive passage ranking, which is important for improving search and retrieval systems, though it appears incremental as it builds on existing LLM-based listwise ranking methods.

The paper tackles the problem of poor performance in complex passage ranking scenarios due to scarce reasoning-intensive training data by proposing ReasonRank, a two-stage post-training approach that synthesizes high-quality training data and uses multi-view ranking rewards. The result is a reasoning-intensive reranker that achieves state-of-the-art performance of 40.6 on the BRIGHT leaderboard with lower latency than baselines.

Large Language Model (LLM) based listwise ranking has shown superior performance in many passage ranking tasks. With the development of Large Reasoning Models, many studies have demonstrated that step-by-step reasoning at test time helps improve listwise ranking performance. However, due to the scarcity of reasoning-intensive training data, existing rerankers perform poorly in many complex ranking scenarios, and the ranking ability of reasoning-intensive rerankers remains largely underdeveloped. In this paper, we first propose an automated reasoning-intensive training data synthesis framework, which sources training queries and passages from diverse domains and applies DeepSeek-R1 to generate high-quality training labels. A self-consistency data filtering mechanism is designed to ensure data quality. To empower the listwise reranker with strong reasoning ability, we further propose a two-stage post-training approach, which includes a cold-start supervised fine-tuning (SFT) stage for reasoning pattern learning and a reinforcement learning (RL) stage for further ranking ability enhancement. During the RL stage, based on the nature of listwise ranking, we design a multi-view ranking reward, which is more effective than a ranking metric-based reward. Extensive experiments demonstrate that our trained reasoning-intensive reranker ReasonRank significantly outperforms existing baselines and also achieves much lower latency than the pointwise reranker Rank1. Through further experiments, ReasonRank has achieved state-of-the-art (SOTA) performance of 40.6 on the BRIGHT leaderboard (https://brightbenchmark.github.io/). Our code is available at https://github.com/8421BCD/ReasonRank.
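
The abstract names a self-consistency data filter and contrasts the multi-view reward with a ranking metric-based reward, but gives no formulas for either. The following is a minimal sketch of what a metric-based score and a consistency filter could look like. It assumes the metric is NDCG@k and that an example is kept when the teacher's listwise ranking agrees with the teacher's own pointwise relevance labels. The function names, the `threshold` value, and the filtering rule are hypothetical illustrations, not the paper's implementation.

```python
import math
from typing import Dict, List


def dcg(gains: List[float]) -> float:
    """Discounted cumulative gain of gains listed in rank order (best first)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg_at_k(ranking: List[str], relevance: Dict[str, float], k: int = 10) -> float:
    """NDCG@k of a ranked list of passage ids against graded relevance labels."""
    gains = [relevance.get(pid, 0.0) for pid in ranking[:k]]
    ideal_dcg = dcg(sorted(relevance.values(), reverse=True)[:k])
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def keep_example(
    listwise_ranking: List[str],
    pointwise_labels: Dict[str, float],
    threshold: float = 0.6,  # hypothetical cutoff, not taken from the paper
    k: int = 10,
) -> bool:
    """Assumed filter: keep an example only if the listwise ranking is
    consistent (by NDCG@k) with the pointwise labels for the same query."""
    return ndcg_at_k(listwise_ranking, pointwise_labels, k) >= threshold


if __name__ == "__main__":
    labels = {"p1": 1.0, "p2": 0.0, "p3": 1.0}
    print(keep_example(["p1", "p3", "p2"], labels))  # relevant passages ranked first
    print(keep_example(["p2", "p1", "p3"], labels))  # irrelevant passage ranked first
```

The same `ndcg_at_k` score could serve as the metric-only RL reward that the abstract uses as its point of comparison. The multi-view reward itself is not specified in this text, so it is not sketched here.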

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
