CLIR | Dec 14, 2021

Boosted Dense Retriever

arXiv:2112.07771v1629 citations
Originality: Incremental advance
AI Analysis

This work addresses the cost of deploying dense retrieval, making it cheaper and faster to serve. The advance is incremental, since it builds on existing dense retrieval methods.

The paper tackles dense retrieval efficiency by proposing DrBoost, a boosting-inspired ensemble method. DrBoost produces representations that are 4x more compact than those of standard dense retrievers while delivering comparable retrieval results, and it reduces latency and bandwidth needs by another 4x under approximate search with coarse quantization.

We propose DrBoost, a dense retrieval ensemble inspired by boosting. DrBoost is trained in stages: each component model is learned sequentially and specialized by focusing only on retrieval mistakes made by the current ensemble. The final representation is the concatenation of the output vectors of all the component models, making it a drop-in replacement for standard dense retrievers at test time. DrBoost enjoys several advantages compared to standard dense retrieval models. It produces representations which are 4x more compact, while delivering comparable retrieval results. It also performs surprisingly well under approximate search with coarse quantization, reducing latency and bandwidth needs by another 4x. In practice, this can make the difference between serving indices from disk versus from memory, paving the way for much cheaper deployments.
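The two mechanisms in the abstract, concatenated component outputs and training on the current ensemble's mistakes, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how mistakes are selected, so the rule used here (any non-gold passage scoring at least as high as the gold passage) is an assumption, and every name and number below (`concat`, `hard_negatives`, the toy vectors) is hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def concat(parts: Sequence[Sequence[float]]) -> Vector:
    """Final representation: the component outputs joined end to end."""
    return [x for part in parts for x in part]


def hard_negatives(
    query_parts: Sequence[Sequence[float]],
    passage_parts: Sequence[Sequence[Sequence[float]]],
    gold_index: int,
) -> List[int]:
    """Indices of passages the current ensemble gets wrong for this query.

    Assumption (not from the source): a mistake is any non-gold passage
    whose score is at least the gold passage's score.
    """
    q = concat(query_parts)
    scores = [dot(q, concat(p)) for p in passage_parts]
    gold_score = scores[gold_index]
    return [
        i for i, s in enumerate(scores)
        if i != gold_index and s >= gold_score
    ]


# Toy data: an ensemble of two components, each emitting 2-dimensional vectors.
query_parts = [[1.0, 0.0], [0.0, 1.0]]
passage_parts = [
    [[0.5, 0.0], [0.0, 0.5]],  # passage 0, the gold passage
    [[2.0, 0.0], [0.0, 0.0]],  # passage 1, outscores the gold passage
    [[0.0, 1.0], [1.0, 0.0]],  # passage 2, scores below the gold passage
]

# Scoring the concatenation is the same as summing per-component scores,
# which is why the ensemble can be searched like a single dense retriever.
full_score = dot(concat(query_parts), concat(passage_parts[0]))
summed_score = sum(dot(q, p) for q, p in zip(query_parts, passage_parts[0]))

# Passages the next component would be trained to fix under the assumed rule.
mistakes = hard_negatives(query_parts, passage_parts, gold_index=0)
```

Because the inner product over concatenated vectors decomposes into a sum of per-component inner products, adding a component only appends dimensions to the index. Each new component can then be trained on the cases in `mistakes` without changing how earlier components are served.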

