IR | CL | Nov 18, 2022

CITADEL: Conditional Token Interaction via Dynamic Lexical Routing for Efficient and Effective Multi-Vector Retrieval

Meta AI
arXiv:2211.10411v1 | 229 citations | h-index: 87 | Has Code
Originality: Highly original
AI Analysis

This paper addresses the efficiency bottleneck of multi-vector retrieval in information retrieval systems, offering a practical method that is nearly 40 times faster than ColBERT-v2.

Multi-vector retrieval methods are slow and need large indices. The paper proposes CITADEL, a model that uses dynamic lexical routing to reduce computation while maintaining high accuracy. It matches or slightly exceeds the previous state of the art, ColBERT-v2, on the MS MARCO (in-domain) and BEIR (out-of-domain) benchmarks while running nearly 40 times faster.

Multi-vector retrieval methods combine the merits of sparse (e.g. BM25) and dense (e.g. DPR) retrievers and have achieved state-of-the-art performance on various retrieval tasks. These methods, however, are orders of magnitude slower and need much more space to store their indices compared to their single-vector counterparts. In this paper, we unify different multi-vector retrieval models from a token routing viewpoint and propose conditional token interaction via dynamic lexical routing, namely CITADEL, for efficient and effective multi-vector retrieval. CITADEL learns to route different token vectors to the predicted lexical "keys" such that a query token vector only interacts with document token vectors routed to the same key. This design significantly reduces the computation cost while maintaining high accuracy. Notably, CITADEL achieves the same or slightly better performance than the previous state of the art, ColBERT-v2, on both in-domain (MS MARCO) and out-of-domain (BEIR) evaluations, while being nearly 40 times faster. Code and data are available at https://github.com/facebookresearch/dpr-scale.
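The routing idea in the abstract can be illustrated with a toy sketch. It is not the paper's implementation: it assumes each token vector has already been assigned one lexical key (in CITADEL the router is learned, and the paper's scoring details are not reproduced here), and it uses a ColBERT-style max-similarity sum as a stand-in scoring rule. The function names `build_index`, `routed_score`, and `all_pairs_score` are hypothetical.

```python
from collections import defaultdict


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def build_index(doc_tokens):
    """Group document token vectors by the key they were routed to.

    doc_tokens is a list of (key, vector) pairs; the keys are assumed
    to come from some router and are simply given here.
    """
    index = defaultdict(list)
    for key, vec in doc_tokens:
        index[key].append(vec)
    return index


def routed_score(query_tokens, doc_index):
    """Each query token interacts only with document tokens under the same key."""
    score = 0.0
    for key, q_vec in query_tokens:
        candidates = doc_index.get(key, [])
        if candidates:  # no shared key means no interaction at all
            score += max(dot(q_vec, d_vec) for d_vec in candidates)
    return score


def all_pairs_score(query_tokens, doc_tokens):
    """Baseline for contrast: every query token is compared with every document token."""
    return sum(
        max(dot(q_vec, d_vec) for _, d_vec in doc_tokens)
        for _, q_vec in query_tokens
    )


# Toy data: (assumed key, 2-d vector) pairs.
query = [("retrieval", [1.0, 0.0]), ("fast", [0.0, 1.0])]
doc = [
    ("retrieval", [0.5, 0.5]),
    ("retrieval", [1.0, 0.0]),
    ("index", [0.0, 2.0]),
]

index = build_index(doc)
routed = routed_score(query, index)   # 2 dot products
full = all_pairs_score(query, doc)    # 6 dot products
print(routed, full)
```

In this toy case the routed score needs two dot products where the all-pairs baseline needs six. The restriction also changes the score, since the "fast" query token finds no document token under its key. How a learned router trades that off against accuracy is the subject of the paper itself.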

Code Implementations: 1 repo
