DSMay 1

Sparse Neighborhood Graph-Based Approximate Nearest Neighbor Search Revisited: Theoretical Analysis and Optimization

arXiv:2509.1553116.3h-index: 2
Predicted impact top 44% in DS · last 90 daysOriginality Incremental advance
AI Analysis

For practitioners of large-scale vector search, this work eliminates expensive parameter tuning in SNG construction, making it more efficient and theoretically grounded.

The paper introduces OPT-SNG, a framework for analyzing and optimizing Sparse Neighborhood Graph (SNG) construction for approximate nearest neighbor search. It proves theoretical bounds on graph properties and derives a closed-form rule for the truncation parameter, achieving an average 5.9× speedup in index construction time (up to 15.4×) while maintaining search performance.

Graph-based approaches to approximate nearest neighbor search (ANNS) enable fast, high-recall retrieval on billion-scale vector datasets. Among them, the Sparse Neighborhood Graph (SNG) is widely used due to its strong search performance. However, the lack of theoretical understanding of SNG leads to expensive tuning of the truncation parameter that controls graph sparsification. In this work, we present OPT-SNG, a principled framework for analyzing and optimizing SNG construction. We introduce a martingale-based model of the pruning process that characterizes the stochastic evolution of candidate sets during graph construction. Using this framework, we prove that SNG has a maximum out-degree of \(O(n^{2/3+ε})\), where \(ε>0\) is an arbitrarily small constant, and an expected search path length of \(O(\log n)\). Building on these insights, we derive a closed-form rule for selecting the optimal truncation parameter \(R\), thereby eliminating the need for costly parameter sweeping. Extensive experiments on real-world datasets demonstrate that OPT-SNG achieves an average \(5.9\times\) speedup in index construction time, with peak improvements reaching \(15.4\times\), while consistently maintaining or improving search performance.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes