Principled and Scalable Diversity-Aware Retrieval via Cardinality-Constrained Binary Quadratic Programming
For practitioners of retrieval-augmented generation, this provides a principled and scalable solution to diversity-aware retrieval.
The paper tackles the problem of diversity-aware retrieval for RAG, which lacks theoretical guarantees and scalability. The proposed CCBQP formulation balances relevance and diversity, and the algorithm achieves dominance on the Pareto frontier with significant speedup.
Diversity-aware retrieval is essential for Retrieval-Augmented Generation (RAG), yet existing methods lack theoretical guarantees and face scalability issues as the number of retrieved passages $k$ increases. We propose a principled formulation of diversity retrieval as a cardinality-constrained binary quadratic programming (CCBQP), which explicitly balances relevance and semantic diversity through an interpretable trade-off parameter. Inspired by recent advances in combinatorial optimization, we develop a non-convex tight continuous relaxation and a Frank--Wolfe based algorithm with landscape analysis and convergence guarantees. Extensive experiments demonstrate that our method consistently dominates baselines on the relevance-diversity Pareto frontier, while achieving significant speedup.