CV Sep 17, 2015

Improved Residual Vector Quantization for High-dimensional Approximate Nearest Neighbor Search

arXiv:1509.05195v17.012 citations

Originality: Incremental advance

AI Analysis

This work addresses a domain-specific problem for researchers and practitioners in large-scale approximate nearest neighbor search, offering an incremental improvement over existing quantization methods.

The paper addresses two limitations of Residual Vector Quantization (RVQ) in high-dimensional approximate nearest neighbor search: the performance gain diminishes as stages are added, and encoding is NP-hard. It proposes an improved method (IRVQ) that learns each stage's codebook with a hybrid of subspace clustering and warm-started k-means, and encodes vectors with a multi-path scheme. On benchmark datasets, the reported results show a substantial improvement over RVQ and better performance than state-of-the-art methods.

Quantization methods have been introduced to perform large-scale approximate nearest neighbor search tasks. Residual Vector Quantization (RVQ) is one of the effective quantization methods. RVQ uses a multi-stage codebook learning scheme to lower the quantization error stage by stage. However, RVQ has two major limitations when applied to high-dimensional approximate nearest neighbor search: 1. The performance gain diminishes quickly with added stages. 2. Encoding a vector with RVQ is actually NP-hard. In this paper, we propose an improved residual vector quantization (IRVQ) method. Our IRVQ learns the codebook with a hybrid method of subspace clustering and warm-started k-means on each stage to prevent the performance gain from dropping, and uses a multi-path encoding scheme to encode a vector with lower distortion. Experimental results on the benchmark datasets show that our method substantially improves on RVQ and delivers better performance than the state of the art.
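To make the stage-by-stage idea concrete, here is a minimal NumPy sketch of residual quantization with a multi-path encoder. It is an illustration under stated assumptions, not the paper's implementation: the function names (`kmeans`, `train_rvq`, `encode_multipath`) and the `beam` parameter are hypothetical, the codebooks are learned with plain k-means rather than the hybrid of subspace clustering and warm-started k-means, and the multi-path scheme is approximated by an ordinary beam search over partial codes.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a (k, d) codebook."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distances from every point to every center, shape (n, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = d2.argmin(1)
        for j in range(k):
            members = X[assign == j]
            if len(members):
                centers[j] = members.mean(0)
    return centers

def train_rvq(X, num_stages, k):
    """Baseline RVQ training: each stage clusters the residuals
    left over by the previous stages (assumption: plain k-means)."""
    codebooks, residual = [], X.copy()
    for stage in range(num_stages):
        C = kmeans(residual, k, seed=stage)
        assign = ((residual[:, None, :] - C[None]) ** 2).sum(-1).argmin(1)
        residual = residual - C[assign]
        codebooks.append(C)
    return codebooks

def encode_multipath(x, codebooks, beam=4):
    """Beam-search encoding: keep the `beam` lowest-error partial codes
    at each stage instead of committing to one greedy choice."""
    paths = [((), x.copy())]  # each path is (codes so far, current residual)
    for C in codebooks:
        candidates = []
        for codes, r in paths:
            d2 = ((r[None, :] - C) ** 2).sum(-1)
            for j in np.argsort(d2)[:beam]:
                candidates.append((float(d2[j]), codes + (int(j),), r - C[j]))
        candidates.sort(key=lambda t: t[0])
        paths = [(codes, r) for _, codes, r in candidates[:beam]]
    best_codes, best_residual = paths[0]
    return best_codes, float((best_residual ** 2).sum())
```

With `beam=1` the encoder reduces to the greedy stage-by-stage encoding of baseline RVQ. A wider beam explores more code combinations and often finds a lower-distortion code, although a beam search of this kind does not guarantee the optimal encoding.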
