CV | IR | LG | Jun 5, 2025

LotusFilter: Fast Diverse Nearest Neighbor Search via a Learned Cutoff Table

arXiv:2506.04790v1 | 3 citations | h-index: 1 | Has Code | CVPR
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for diverse search results in applications like RAG, though it is incremental in that it builds on existing ANNS methods.

The paper tackles the problem of overly similar results in approximate nearest neighbor search (ANNS) by proposing LotusFilter, a post-processing module that diversifies ANNS results. The filter runs at 0.02 ms per query in settings resembling real-world RAG applications.

Approximate nearest neighbor search (ANNS) is an essential building block for applications like RAG but can sometimes yield results that are overly similar to each other. In certain scenarios, search results should be similar to the query and yet diverse. We propose LotusFilter, a post-processing module to diversify ANNS results. We precompute a cutoff table summarizing vectors that are close to each other. During the filtering, LotusFilter greedily looks up the table to delete redundant vectors from the candidates. We demonstrated that the LotusFilter operates fast (0.02 [ms/query]) in settings resembling real-world RAG applications, utilizing features such as OpenAI embeddings. Our code is publicly available at https://github.com/matsui528/lotf.
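The two steps described in the abstract (precomputing a table of mutually close vectors, then greedily deleting redundant candidates) can be illustrated with a minimal sketch. This is not the API of the authors' `lotf` repository: the function names `build_cutoff_table` and `diverse_filter`, the brute-force table construction, and the fixed threshold `eps` are all assumptions made for illustration. In particular, the paper's title refers to a learned cutoff table, whereas this sketch simply hard-codes the threshold.

```python
from itertools import combinations


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_cutoff_table(vectors, eps):
    """Hypothetical helper: map each vector id to the ids of all other
    vectors whose squared distance to it is below eps.
    Brute force, O(N^2); suitable only for a toy example."""
    table = {i: set() for i in range(len(vectors))}
    for i, j in combinations(range(len(vectors)), 2):
        if squared_l2(vectors[i], vectors[j]) < eps:
            table[i].add(j)
            table[j].add(i)
    return table


def diverse_filter(candidates, table, k):
    """Hypothetical helper: greedily keep up to k candidates.
    `candidates` must be ids sorted by increasing distance to the query.
    Each kept id removes its table neighbours from further consideration."""
    removed = set()
    selected = []
    for cid in candidates:
        if cid in removed:
            continue  # too close to an already selected vector
        selected.append(cid)
        if len(selected) == k:
            break
        removed |= table[cid]  # table lookup deletes redundant vectors
    return selected


# Toy data: two tight clusters and one isolated point.
vectors = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (1.1, 1.0), (3.0, 3.0)]
table = build_cutoff_table(vectors, eps=0.05)  # eps is an assumed, fixed threshold

# Stand-in for the ANNS step: exact search by sorting all ids by distance.
query = (0.0, 0.0)
candidates = sorted(range(len(vectors)), key=lambda i: squared_l2(query, vectors[i]))

result = diverse_filter(candidates, table, k=3)
print(result)
```

On this toy data the plain top-3 would be ids 0, 1, 2, all from the same cluster, while the filtered result is `[0, 3, 5]`, one id per cluster. The per-query work is only set lookups over the candidate list, which is consistent with the abstract's point that the expensive part is precomputed.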

Code Implementations: 1 repo
