CVApr 21, 2017

PQTable: Non-exhaustive Fast Search for Product-quantized Codes using Hash Tables

Yusuke Matsui, Toshihiko Yamasaki, Kiyoharu Aizawa

arXiv:1704.06556v11.712 citations

Originality Highly original

AI Analysis

This provides a practical and efficient solution for real-world large-scale similarity search problems, particularly in domains like image retrieval, by eliminating manual parameter tuning and training requirements compared to previous methods.

The paper tackles the problem of fast nearest neighbor search for product-quantized codes by proposing PQTable, a hash-table-based method that achieves the same results as linear scanning but is 10^2 to 10^5 times faster, with performance as low as 0.059 ms per query over 10^9 data points using 5.5 GB memory.

In this paper, we propose a product quantization table (PQTable); a fast search method for product-quantized codes via hash-tables. An identifier of each database vector is associated with the slot of a hash table by using its PQ-code as a key. For querying, an input vector is PQ-encoded and hashed, and the items associated with that code are then retrieved. The proposed PQTable produces the same results as a linear PQ scan, and is 10^2 to 10^5 times faster. Although state-of-the-art performance can be achieved by previous inverted-indexing-based approaches, such methods require manually-designed parameter setting and significant training; our PQTable is free of these limitations, and therefore offers a practical and effective solution for real-world problems. Specifically, when the vectors are highly compressed, our PQTable achieves one of the fastest search performances on a single CPU to date with significantly efficient memory usage (0.059 ms per query over 10^9 data points with just 5.5 GB memory consumption). Finally, we show that our proposed PQTable can naturally handle the codes of an optimized product quantization (OPQTable).

View on arXiv PDF

Similar