IR, CL | Feb 13, 2023

Improving Out-of-Distribution Generalization of Neural Rerankers with Contextualized Late Interaction

arXiv:2302.06589v1 | 1 citation | h-index: 87
Originality: Incremental advance
AI Analysis

This work addresses the challenge of robust information retrieval for users in cross-domain settings, though it is incremental as it builds on existing multi-vector methods.

The paper tackles out-of-distribution generalization for neural rerankers by adding late interaction, which yields an extra 5% average improvement on out-of-distribution datasets with little added latency and no in-domain degradation.

Recent progress in information retrieval finds that embedding queries and documents into multi-vector representations yields a robust bi-encoder retriever on out-of-distribution datasets. In this paper, we explore whether late interaction, the simplest form of multi-vector, is also helpful to neural rerankers that only use the [CLS] vector to compute the similarity score. Although intuitively the attention mechanism of rerankers at the previous layers already gathers the token-level information, we find that adding late interaction still brings an extra 5% improvement on average on out-of-distribution datasets, with little increase in latency and no degradation in in-domain effectiveness. Through extensive experiments and analysis, we show that the finding is consistent across different model sizes and first-stage retrievers of diverse natures, and that the improvement is more prominent on longer queries.
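
For readers unfamiliar with late interaction, the following is a minimal sketch of the commonly used MaxSim scoring (as popularized by ColBERT), where each query token vector is matched to its most similar document token vector and the per-token maxima are summed. The function names and the toy vectors are illustrative assumptions; the abstract does not specify how the paper combines this score with the [CLS]-based score, so that step is not shown.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def late_interaction_score(query_vecs, doc_vecs):
    """MaxSim sketch: for each query token, take the maximum cosine similarity
    over all document tokens, then sum those maxima over the query tokens.

    query_vecs, doc_vecs: lists of token embeddings (lists of floats).
    This is an illustrative assumption, not the paper's exact formulation.
    """
    q = [_normalize(v) for v in query_vecs]
    d = [_normalize(v) for v in doc_vecs]
    return sum(
        max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
        for qv in q
    )


# Toy token embeddings (hypothetical values, 2 query tokens and 3 document tokens).
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[1.0, 0.0], [1.0, 1.0], [0.0, -1.0]]
score = late_interaction_score(query, doc)
```

In contrast to a single [CLS] similarity, this score keeps one term per query token, which is one plausible reason longer queries could benefit more.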

