IR · CL · Feb 13, 2023

Improving Out-of-Distribution Generalization of Neural Rerankers with Contextualized Late Interaction

arXiv:2302.06589v1 · 3.51 citations · h-index: 87

Originality Incremental advance

AI Analysis

This work addresses the challenge of robust information retrieval for users in cross-domain settings, though it is incremental as it builds on existing multi-vector methods.

The paper tackles out-of-distribution generalization for neural rerankers by adding late interaction on top of [CLS]-only scoring, which yields an extra 5% average improvement on out-of-distribution datasets with little added latency and no in-domain degradation.

Recent progress in information retrieval finds that embedding query and document representations into multiple vectors yields a robust bi-encoder retriever on out-of-distribution datasets. In this paper, we explore whether late interaction, the simplest form of multi-vector representation, is also helpful to neural rerankers that only use the [CLS] vector to compute the similarity score. Although, intuitively, the attention mechanism of rerankers in earlier layers already gathers token-level information, we find that adding late interaction still brings an extra 5% improvement on average on out-of-distribution datasets, with little increase in latency and no degradation in in-domain effectiveness. Through extensive experiments and analysis, we show that the finding is consistent across different model sizes and first-stage retrievers of diverse natures, and that the improvement is more prominent on longer queries.
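To make the idea of late interaction concrete, here is a minimal sketch of the commonly used MaxSim form: each query token embedding is matched to its most similar document token embedding, and the per-token maxima are summed. The summary above does not say how the paper combines this with the [CLS] score, so the additive `combined_score` function and its `weight` parameter are assumptions for illustration only, and the toy vectors stand in for contextualized token embeddings from a reranker.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def late_interaction_score(query_vecs: Sequence[Vector],
                           doc_vecs: Sequence[Vector]) -> float:
    """MaxSim: for every query token, take the highest dot product
    against any document token, then sum over query tokens."""
    if not query_vecs or not doc_vecs:
        raise ValueError("need at least one query and one document vector")
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def combined_score(cls_score: float,
                   query_vecs: Sequence[Vector],
                   doc_vecs: Sequence[Vector],
                   weight: float = 1.0) -> float:
    """Hypothetical combination (an assumption, not taken from the paper):
    add a weighted late-interaction term to the [CLS]-based score."""
    return cls_score + weight * late_interaction_score(query_vecs, doc_vecs)


# Toy 2-dimensional "token embeddings" (illustrative values only).
query_vecs = [[1.0, 0.0], [0.0, 1.0]]
doc_vecs = [[0.5, 0.5], [1.0, 0.0], [0.0, 0.25]]

maxsim = late_interaction_score(query_vecs, doc_vecs)
total = combined_score(2.0, query_vecs, doc_vecs)
```

Because the score is a sum over query tokens, each additional query token contributes its own term, which is one plausible reading of why the reported gain is more prominent on longer queries.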
