IR | CL | DS | Apr 9, 2024

AiSAQ: All-in-Storage ANNS with Product Quantization for DRAM-free Information Retrieval

arXiv:2404.06004v2 | 12 citations | h-index: 11 | Has Code
AI Analysis

AiSAQ enables DRAM-free information retrieval for large-scale applications such as retrieval-augmented generation (RAG), though the contribution is incremental because it builds on DiskANN.

The paper tackles the problem of high memory usage in billion-scale approximate nearest neighbor search (ANNS) by proposing AiSAQ, which offloads compressed vectors to SSD storage, achieving ~10 MB memory usage without significant latency degradation.
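The idea of keeping compressed vectors on storage and fetching them on demand, rather than holding them all in DRAM, can be illustrated with a toy example. The sketch below is not AiSAQ's index format: the fixed-width file layout, the 4-byte code size, and the names `CODE_BYTES`, `write_codes`, and `read_code` are all assumptions made for illustration.

```python
import os
import tempfile

CODE_BYTES = 4  # hypothetical size of one compressed (PQ) code, in bytes


def write_codes(path, codes):
    """Write fixed-width codes back to back so code i starts at i * CODE_BYTES."""
    with open(path, "wb") as f:
        for code in codes:
            if len(code) != CODE_BYTES:
                raise ValueError("every code must be exactly CODE_BYTES long")
            f.write(code)


def read_code(f, vector_id):
    """Fetch a single code from storage by seeking to its offset."""
    f.seek(vector_id * CODE_BYTES)
    return f.read(CODE_BYTES)


def demo():
    """Store 100 toy codes on disk, then read back only the one for vector 42."""
    codes = [bytes([i, i + 1, i + 2, i + 3]) for i in range(100)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "codes.bin")
        write_codes(path, codes)
        with open(path, "rb") as f:
            return read_code(f, 42)
```

In this toy layout, resident memory for the codes stays constant regardless of how many vectors the file holds, at the cost of one storage read per lookup. A real system would batch and align these reads to the device's block size.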

Graph-based approximate nearest neighbor search (ANNS) algorithms work effectively for large-scale vector retrieval. Among such methods, DiskANN achieves good recall-speed tradeoffs using both DRAM and storage. DiskANN adopts product quantization (PQ) to reduce memory usage, but that usage is still proportional to the scale of the dataset. In this paper, we propose All-in-Storage ANNS with Product Quantization (AiSAQ), which offloads compressed vectors to the SSD index. Our method achieves ~10 MB memory usage in query search with billion-scale datasets without critical latency degradation. AiSAQ also reduces the index load time for query search preparation, which enables fast switching between multiple billion-scale indices. This method can be applied to retrievers of retrieval-augmented generation (RAG) and be scaled out with multiple-server systems for emerging datasets. Our DiskANN-based implementation is available on GitHub.
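To make the abstract's point about product quantization concrete, here is a minimal pure-Python sketch of PQ encoding and asymmetric distance computation. It assumes hand-supplied or randomly sampled codebooks (a stand-in for the k-means training a real system would use), and every function name here is hypothetical rather than taken from DiskANN or AiSAQ.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sample_codebooks(vectors, n_subspaces, n_centroids, seed=0):
    """Build one codebook per subspace by sampling training vectors.

    Assumption: random sampling replaces k-means to keep the sketch short.
    """
    rng = random.Random(seed)
    sub_dim = len(vectors[0]) // n_subspaces
    codebooks = []
    for s in range(n_subspaces):
        lo = s * sub_dim
        picked = rng.sample(vectors, n_centroids)
        codebooks.append([v[lo:lo + sub_dim] for v in picked])
    return codebooks


def encode(vector, codebooks):
    """Replace each sub-vector by the index of its nearest centroid (one byte each)."""
    sub_dim = len(codebooks[0][0])
    code = []
    for s, book in enumerate(codebooks):
        sub = vector[s * sub_dim:(s + 1) * sub_dim]
        code.append(min(range(len(book)), key=lambda c: sq_dist(sub, book[c])))
    return bytes(code)  # requires at most 256 centroids per subspace


def adc_distance(query, code, codebooks):
    """Asymmetric distance: uncompressed query against a PQ-coded vector."""
    sub_dim = len(codebooks[0][0])
    return sum(
        sq_dist(query[s * sub_dim:(s + 1) * sub_dim], codebooks[s][c])
        for s, c in enumerate(code)
    )


def pq_code_bytes(n_vectors, n_subspaces):
    """Total size of all PQ codes at one byte per subspace: linear in n_vectors."""
    return n_vectors * n_subspaces
```

Under the assumed setting of 32 one-byte subspaces per vector, `pq_code_bytes(10**9, 32)` gives 32 × 10^9 bytes for a billion vectors. This linear growth is why codes held in DRAM still scale with the dataset even after compression.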

Code Implementations: 1 repo
