IRJul 27, 2021

Understanding and Generalizing Monotonic Proximity Graphs for Approximate Nearest Neighbor Search

arXiv:2107.13052v11 citations
Originality Incremental advance
AI Analysis

This work addresses a fundamental gap in theory for practitioners in machine learning and data retrieval, though it is incremental as it builds on existing graph-based methods.

The paper tackles the lack of theoretical understanding in graph-based approximate nearest neighbor search by analyzing the Monotonic Relative Neighborhood Graph model, using mathematical proofs to explain its good performance and experiments to guide its generalization for large-scale use.

Graph-based algorithms have shown great empirical potential for the approximate nearest neighbor (ANN) search problem. Currently, graph-based ANN search algorithms are designed mainly using heuristics, whereas theoretical analysis of such algorithms is quite lacking. In this paper, we study a fundamental model of proximity graphs used in graph-based ANN search, called Monotonic Relative Neighborhood Graph (MRNG), from a theoretical perspective. We use mathematical proofs to explain why proximity graphs that are built based on MRNG tend to have good searching performance. We also run experiments on MRNG and graphs generalizing MRNG to obtain a deeper understanding of the model. Our experiments give guidance on how to approximate and generalize MRNG to build proximity graphs on a large scale. In addition, we discover and study a hidden structure of MRNG called conflicting nodes, and we give theoretical evidence how conflicting nodes could be used to improve ANN search methods that are based on MRNG.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes