IR · Jan 21, 2021

Rethink Training of BERT Rerankers in Multi-Stage Retrieval Pipeline

arXiv:2101.08751v1 · 151 citations
AI Analysis

This addresses an incremental issue for researchers and practitioners in text retrieval: getting rerankers to benefit from stronger first-stage retrievers in a multi-stage pipeline.

The paper tackles the problem that popular rerankers in multi-stage retrieval pipelines fail to fully exploit improved retrieval results, and proposes Localized Contrastive Estimation (LCE) to train rerankers, demonstrating significant improvements in deep two-stage models.

Pre-trained deep language models (LMs) have advanced the state of the art in text retrieval. Rerankers fine-tuned from deep LMs estimate candidate relevance based on rich contextualized matching signals. Meanwhile, deep LMs can also be leveraged to improve the search index, building retrievers with better recall. One would expect a straightforward combination of both in a pipeline to yield additive performance gains. In this paper, we discover otherwise: popular rerankers cannot fully exploit the improved retrieval results. We therefore propose Localized Contrastive Estimation (LCE) for training rerankers and demonstrate that it significantly improves deep two-stage models.
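
The abstract names LCE but does not spell out the objective. As a minimal sketch, assume it takes the usual contrastive form: for each query, one relevant passage is scored against negatives drawn from the top of the first-stage retriever's ranking (the "localized" part), and the loss is softmax cross-entropy over that group. The function names, the `depth` and `n_neg` parameters, and the convention that the positive sits at index 0 are all hypothetical choices for illustration, not taken from the paper's code.

```python
import math
import random


def sample_localized_negatives(ranked_ids, positive_ids, n_neg, depth, rng):
    """Pick negatives from the retriever's top `depth` results (assumed setup).

    ranked_ids: passage ids in retriever rank order for one query.
    positive_ids: set of ids judged relevant for that query.
    """
    pool = [d for d in ranked_ids[:depth] if d not in positive_ids]
    return rng.sample(pool, min(n_neg, len(pool)))


def lce_loss(group_scores):
    """Softmax cross-entropy over one query's candidate group.

    group_scores[0] is the reranker score of the relevant passage;
    the remaining entries are scores of the sampled negatives.
    """
    m = max(group_scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in group_scores))
    return log_z - group_scores[0]


def batch_lce_loss(batch):
    """Average the per-query loss over a batch of score groups."""
    return sum(lce_loss(group) for group in batch) / len(batch)


if __name__ == "__main__":
    rng = random.Random(0)
    ranked = ["p7", "p3", "p9", "p1", "p4", "p8"]
    negatives = sample_localized_negatives(ranked, {"p3"}, n_neg=2, depth=4, rng=rng)
    print(negatives)
    # Hypothetical reranker scores: positive first, then the two negatives.
    print(batch_lce_loss([[2.0, 0.5, -1.0]]))
```

In a real training loop the scores would come from the reranker's forward pass and the same quantity would be computed with a framework's cross-entropy over the group dimension; the pure-Python version only shows the shape of the objective.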

Code Implementations: 1 repo
