LG · CL · ML · Jul 21, 2015

Clustering is Efficient for Approximate Maximum Inner Product Search

arXiv:1507.05910v3 · 17 citations
AI Analysis

This work addresses the need for faster and more robust approximate MIPS in applications like recommendation systems and large-scale classification, offering an incremental improvement over existing methods.

The paper tackles the problem of efficient Maximum Inner Product Search (MIPS) by proposing a simple approach based on spherical k-means clustering, which achieves much higher speedups for the same retrieval precision compared to state-of-the-art hashing-based and tree-based methods on standard benchmarks.

Efficient Maximum Inner Product Search (MIPS) is an important task with wide applicability in recommendation systems and in classification with a large number of classes. Solutions based on locality-sensitive hashing (LSH), as well as tree-based solutions, have been investigated in the recent literature to perform approximate MIPS in sublinear time. In this paper, we compare these to another extremely simple approach for solving approximate MIPS, based on variants of the k-means clustering algorithm. Specifically, we propose to train a spherical k-means after reducing the MIPS problem to a Maximum Cosine Similarity Search (MCSS). Experiments on two standard recommendation system benchmarks, as well as on large-vocabulary word embeddings, show that this simple approach yields much higher speedups, for the same retrieval precision, than current state-of-the-art hashing-based and tree-based methods. This simple method also yields more robust retrievals when the query is corrupted by noise.
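The pipeline described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which MIPS-to-MCSS reduction is used, so the sketch assumes a common one (append one coordinate to each database vector so that all vectors share the same norm, and append a zero to the query). The function names `mips_to_mcss`, `spherical_kmeans`, and `approx_mips`, and the `n_probe` parameter, are hypothetical.

```python
import numpy as np

def mips_to_mcss(X):
    """Assumed reduction: pad each row so all rows have norm M = max ||x||."""
    norms = np.linalg.norm(X, axis=1)
    M = norms.max()
    # Clamp at 0 to guard against tiny negative values from rounding.
    extra = np.sqrt(np.maximum(M**2 - norms**2, 0.0))
    return np.hstack([X, extra[:, None]])

def spherical_kmeans(X, k, iters=20, seed=0):
    """Plain spherical k-means: cosine assignment, unit-norm centroids."""
    rng = np.random.default_rng(seed)
    U = X / np.linalg.norm(X, axis=1, keepdims=True)
    C = U[rng.choice(len(U), size=k, replace=False)].copy()
    for _ in range(iters):
        assign = np.argmax(U @ C.T, axis=1)
        for j in range(k):
            members = U[assign == j]
            if len(members):
                c = members.sum(axis=0)
                n = np.linalg.norm(c)
                if n > 0:
                    C[j] = c / n  # re-project the centroid onto the unit sphere
    assign = np.argmax(U @ C.T, axis=1)
    return C, assign

def approx_mips(q, X, C, assign, n_probe=1):
    """Search only the n_probe clusters whose centroids best match the query."""
    q_aug = np.append(q, 0.0)  # the query gets a zero in the padded coordinate
    top = np.argsort(-(C @ q_aug))[:n_probe]
    cand = np.flatnonzero(np.isin(assign, top))
    if cand.size == 0:  # the selected clusters were empty: fall back to full scan
        cand = np.arange(len(X))
    return cand[np.argmax(X[cand] @ q)]  # exact inner products on candidates only
```

Under this assumed reduction, the padded vectors all have norm M and the padded query has the same norm as the original, so ranking by cosine similarity in the padded space gives the same order as ranking by inner product in the original space. In the sketch, `n_probe` trades speed for precision: probing every cluster recovers the exact answer, while probing fewer clusters scores fewer candidates.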

