IR · CL · LG · Oct 20, 2020

CoRT: Complementary Rankings from Transformers

arXiv:2010.10252v2729 citations
Originality: Incremental advance
AI Analysis

This work addresses retrieval efficiency and recall for information retrieval systems, but it is incremental as it builds on existing multi-stage ranking pipelines and pretrained models.

The paper tackles the problem of relevant passages being missed in first-stage retrieval. It proposes CoRT, a neural model that complements BM25 using contextual representations from pretrained language models. On the MS MARCO dataset, this significantly increases candidate recall and lets re-rankers achieve superior results with fewer candidates.

Many recent approaches towards neural information retrieval mitigate their computational costs by using a multi-stage ranking pipeline. In the first stage, a number of potentially relevant candidates are retrieved using an efficient retrieval model such as BM25. Although BM25 has proven to perform decently as a first-stage ranker, it tends to miss relevant passages. In this context we propose CoRT, a simple neural first-stage ranking model that leverages contextual representations from pretrained language models such as BERT to complement term-based ranking functions while causing no significant delay at query time. Using the MS MARCO dataset, we show that CoRT significantly increases the candidate recall by complementing BM25 with missing candidates. Consequently, we find subsequent re-rankers achieve superior results with fewer candidates. We further demonstrate that passage retrieval using CoRT can be realized with surprisingly low latencies.
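The idea of complementing BM25 with candidates from a second ranker can be illustrated with a small sketch. The abstract does not say how the two candidate lists are combined, so the interleaving strategy below is an assumption for illustration, and the names `merge_candidates` and `recall` as well as the toy passage ids are hypothetical.

```python
from itertools import zip_longest


def merge_candidates(bm25_ranking, neural_ranking, k):
    """Interleave two ranked lists of passage ids, skip duplicates, keep k.

    Assumption: simple rank-wise interleaving; the abstract does not
    specify the actual merging strategy.
    """
    merged, seen = [], set()
    for pair in zip_longest(bm25_ranking, neural_ranking):
        for pid in pair:
            # zip_longest pads the shorter list with None.
            if pid is not None and pid not in seen:
                seen.add(pid)
                merged.append(pid)
    return merged[:k]


def recall(candidates, relevant):
    """Fraction of relevant passages that appear among the candidates."""
    if not relevant:
        return 0.0
    return len(set(candidates) & set(relevant)) / len(relevant)


# Toy data (hypothetical): BM25 misses the relevant passage "p9".
bm25 = ["p1", "p2", "p3", "p4"]
neural = ["p9", "p2", "p7", "p1"]
relevant = {"p2", "p9"}

merged = merge_candidates(bm25, neural, k=4)
bm25_recall = recall(bm25[:4], relevant)
merged_recall = recall(merged, relevant)
```

With the same candidate budget of four passages, the merged list in this toy example covers both relevant passages while BM25 alone covers only one, which is the kind of recall gain the abstract describes.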

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
