OSDBMar 23

GateANN: I/O-Efficient Filtered Vector Search on SSDs

arXiv:2603.214665.4h-index: 1
Predicted impact top 85% in OS · last 90 daysOriginality Incremental advance
AI Analysis

This addresses the bottleneck of I/O waste and expensive rebuilds in large-scale vector databases for applications like recommendation systems, though it is incremental as it builds on existing graph ANNS methods.

The paper tackled the problem of I/O inefficiency in SSD-based graph approximate nearest neighbor search (ANNS) systems for filtered vector search, achieving up to a 10x reduction in SSD reads and a 7.6x improvement in throughput.

We present GateANN, an I/O-efficient SSD-based graph ANNS system that supports filtered vector search on an unmodified graph index. Existing SSD-based systems either waste I/O by post-filtering, or require expensive filter-aware index rebuilds. GateANN avoids both by decoupling graph traversal from vector retrieval. Our key insight is that traversing a node requires only its neighbor list and an approximate distance, neither of which needs the full-precision vector on SSD. Based on this, GateANN introduces graph tunneling. It checks each node's filter predicate in memory before issuing I/O and routes through non-matching nodes entirely in memory, preserving graph connectivity without any SSD read for non-matching nodes. Our experimental results show that it reduces SSD reads by up to 10x and improves throughput by up to 7.6x.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes