IR DBOct 17, 2021

Low-Precision Quantization for Efficient Nearest Neighbor Search

Anthony Ko, Iman Keivanloo, Vihan Lakshman, Eric Schkufza

arXiv:2110.08919v19.59 citationsHas Code

Originality Incremental advance

AI Analysis

This work addresses memory and speed bottlenecks in information retrieval and recommendation systems, offering an incremental improvement that can be combined with existing algorithms.

The paper tackles the problem of efficient nearest neighbor search by introducing a low-precision quantization method that preserves distance metric behavior, resulting in a 60% memory reduction, 30% query throughput improvement, and only a 2% recall reduction.

Fast k-Nearest Neighbor search over real-valued vector spaces (KNN) is an important algorithmic task for information retrieval and recommendation systems. We present a method for using reduced precision to represent vectors through quantized integer values, enabling both a reduction in the memory overhead of indexing these vectors and faster distance computations at query time. While most traditional quantization techniques focus on minimizing the reconstruction error between a point and its uncompressed counterpart, we focus instead on preserving the behavior of the underlying distance metric. Furthermore, our quantization approach is applied at the implementation level and can be combined with existing KNN algorithms. Our experiments on both open source and proprietary datasets across multiple popular KNN frameworks validate that quantized distance metrics can reduce memory by 60% and improve query throughput by 30%, while incurring only a 2% reduction in recall.

View on arXiv PDF

Similar