CV LG Dec 17, 2021

Nearest neighbor search with compact codes: A decoder perspective

Kenza Amara, Matthijs Douze, Alexandre Sablayrolles, Hervé Jégou

arXiv:2112.09568v2 5.66 citations

Originality: Incremental advance

AI Analysis

This work addresses faster retrieval in billion-scale datasets, but appears incremental as it builds on existing compression methods.

The paper tackles the problem of improving nearest neighbor search performance by reinterpreting existing compression methods as auto-encoders and designing better decoders. It reports significant improvements over binary hashing and product quantization on popular benchmarks.

Modern approaches for fast retrieval of similar vectors on billion-scale datasets rely on compressed-domain approaches such as binary sketches or product quantization. These methods minimize a certain loss, typically the mean squared error or other objective functions tailored to the retrieval problem. In this paper, we re-interpret popular methods such as binary hashing or product quantizers as auto-encoders, and point out that they implicitly make suboptimal assumptions on the form of the decoder. We design backward-compatible decoders that improve the reconstruction of the vectors from the same codes, which translates to better performance in nearest neighbor search. Our method significantly improves over binary hashing methods and product quantization on popular benchmarks.
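The auto-encoder reading in the abstract can be illustrated with a small sketch. It is a simplified stand-in, not the decoders proposed in the paper: it assumes a random-projection sign hash as the fixed encoder, a scaled transpose of the projection as the "implicit" baseline decoder, and a least-squares linear map as the improved decoder. All function names, sizes, and the synthetic data are hypothetical. The point it shows is only that the codes stay unchanged while a decoder fitted to the data reconstructs the vectors at least as well.

```python
import numpy as np


def encode(X, W):
    """Fixed binary-hash encoder (assumed): sign of random projections, codes in {-1, +1}."""
    return np.where(X @ W.T >= 0, 1.0, -1.0)


def baseline_decode(codes, W, X_train):
    """Assumed 'implicit' decoder: project codes back through W, with one fitted scale."""
    R = codes @ W
    alpha = np.sum(X_train * R) / np.sum(R * R)  # best scalar in the least-squares sense
    return alpha * R


def fit_linear_decoder(codes, X_train):
    """Least-squares decoder from the same codes (plus a bias column) to the vectors."""
    C = np.hstack([codes, np.ones((codes.shape[0], 1))])
    D, *_ = np.linalg.lstsq(C, X_train, rcond=None)
    return D


def linear_decode(codes, D):
    C = np.hstack([codes, np.ones((codes.shape[0], 1))])
    return C @ D


def mse(X, X_hat):
    """Mean squared reconstruction error per vector."""
    return float(np.mean(np.sum((X - X_hat) ** 2, axis=1)))


def compare_decoders(n=2000, d=32, k=16, seed=0):
    """Return (baseline_mse, learned_mse) on synthetic correlated data."""
    rng = np.random.default_rng(seed)
    mixing = rng.standard_normal((d, d))
    X = rng.standard_normal((n, d)) @ mixing  # synthetic, correlated vectors
    W = rng.standard_normal((k, d))           # random projection, k bits per vector
    codes = encode(X, W)                      # codes are computed once and never changed
    base = mse(X, baseline_decode(codes, W, X))
    learned = mse(X, linear_decode(codes, fit_linear_decoder(codes, X)))
    return base, learned


if __name__ == "__main__":
    base, learned = compare_decoders()
    print(f"baseline decoder MSE: {base:.3f}")
    print(f"learned decoder MSE:  {learned:.3f}")
```

Because the baseline decoder is itself a linear map of the codes, the least-squares fit cannot do worse on the data it was fitted to. In a retrieval setting, the reconstructions from such a decoder would replace the baseline reconstructions when computing distances, while the stored codes remain as they are.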
