OS | DB | May 18

PipeANN-Filter: An Efficient Filtered Vector Search System on SSD

arXiv:2605.17992 | 15.1 | Has Code
AI Analysis

For practitioners needing efficient filtered vector search on SSD, this work offers a practical I/O optimization.

PipeANN-Filter improves filtered vector search on SSD by exploring a superset of valid vectors using probabilistic data structures, reducing SSD I/O. It achieves better search latency and throughput than state-of-the-art systems.

We propose PipeANN-Filter, an efficient filtered vector search system on SSD. Unlike existing systems that explore only valid vectors (i.e., those satisfying the attribute constraints) during search, PipeANN-Filter explores a superset of valid vectors, and performs attribute verification after getting the top-k closest result vectors. This allows PipeANN-Filter to leverage probabilistic data structures (e.g., Bloom filters) to identify the superset, trading off a small number of false-positive vector explorations for a massive reduction in SSD I/O for attribute reading. Evaluations show that PipeANN-Filter improves search latency and throughput compared to state-of-the-art systems. PipeANN-Filter is open-source at https://github.com/thustorage/PipeANN
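The superset-then-verify idea in the abstract can be illustrated with a minimal sketch. This is not PipeANN-Filter's code: it assumes a brute-force scan in place of the on-SSD graph traversal, one small in-memory Bloom filter per vector, and a caller-supplied `read_attributes` function standing in for the expensive attribute read from SSD. All names (`BloomFilter`, `build_blooms`, `filtered_search`) are hypothetical.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=64, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array stored in a Python int

    def _positions(self, item):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def build_blooms(attributes):
    """Build one Bloom filter per vector from its set of attribute labels."""
    blooms = []
    for labels in attributes:
        bloom = BloomFilter()
        for label in labels:
            bloom.add(label)
        blooms.append(bloom)
    return blooms


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def filtered_search(query, vectors, blooms, read_attributes, label, k):
    """Return indices of the k closest vectors that carry `label`.

    Phase 1 ranks the Bloom-positive superset by distance without reading
    any real attributes. Phase 2 reads real attributes (assumed to be the
    costly step) only for candidates near the top, until k are confirmed.
    """
    # Phase 1: explore the superset of valid vectors (may include false positives).
    candidates = sorted(
        (sq_dist(query, vec), idx)
        for idx, vec in enumerate(vectors)
        if blooms[idx].might_contain(label)
    )
    # Phase 2: verify attributes in distance order and drop false positives.
    results = []
    for _, idx in candidates:
        if label in read_attributes(idx):
            results.append(idx)
            if len(results) == k:
                break
    return results
```

Because a Bloom filter never reports a false negative, the sketch returns the same result as an exact filtered scan. The saving is that `read_attributes` is called only for the closest superset candidates rather than for every vector visited.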
