ML DS LG · Sep 23, 2015

Fast k-NN search

Ville Hyvönen, Teemu Pitkänen, Sotiris Tasoulis, Elias Jääsaari, Risto Tuomainen, Liang Wang, Jukka Corander, Teemu Roos

arXiv:1509.06957v26.112 citations · Has Code

Originality: Incremental advance

AI Analysis

This incremental improvement targets approximate nearest neighbor search, as used in recommendation systems and similar applications, by reducing memory usage and speeding up queries.

The paper tackles the problem of slow and memory-intensive approximate nearest neighbor search in high-dimensional spaces by combining multiple random projection trees with a novel voting scheme, resulting in a method that is faster than existing approaches at high accuracy levels.

Efficient index structures for fast approximate nearest neighbor queries are required in many applications such as recommendation systems. In high-dimensional spaces, many conventional methods suffer from excessive memory usage and slow response times. We propose a method where multiple random projection trees are combined by a novel voting scheme. The key idea is to exploit the redundancy in a large number of candidate sets obtained by independently generated random projections in order to reduce the number of expensive exact distance evaluations. The method is straightforward to implement using sparse projections, which leads to a reduced memory footprint and fast index construction. Furthermore, it enables grouping of the required computations into big matrix multiplications, which leads to additional savings due to cache effects and low-level parallelization. We demonstrate by extensive experiments on a wide variety of data sets that the method is faster than existing partitioning-tree or hashing-based approaches, making it the fastest available technique at high accuracy levels.
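
The voting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`sparse_direction`, `build_tree`, `leaf_for`, `knn_query`) and the parameters `density` and `min_votes` are hypothetical, and the sketch assumes one shared sparse random direction per tree level with a median split at each node. Each tree routes the query to a single leaf, every point in that leaf receives one vote, and exact distances are computed only for points whose vote count reaches the threshold.

```python
import random
from collections import Counter


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sparse_direction(dim, density, rng):
    # Each component is a standard normal draw with probability `density`,
    # otherwise zero. Resample if the vector happens to be all zeros.
    while True:
        d = [rng.gauss(0.0, 1.0) if rng.random() < density else 0.0
             for _ in range(dim)]
        if any(d):
            return d


def build_tree(points, depth, density, rng):
    # Assumption: all nodes on the same level share one random direction.
    dim = len(points[0])
    directions = [sparse_direction(dim, density, rng) for _ in range(depth)]

    def split(indices, level):
        if level == depth or len(indices) <= 1:
            return indices  # leaf: list of point indices
        proj = {i: dot(points[i], directions[level]) for i in indices}
        ordered = sorted(indices, key=proj.get)
        mid = len(ordered) // 2
        # Internal node: (threshold, left subtree, right subtree).
        return (proj[ordered[mid]],
                split(ordered[:mid], level + 1),
                split(ordered[mid:], level + 1))

    return directions, split(list(range(len(points))), 0)


def leaf_for(tree, q):
    directions, node = tree
    # All projections of the query for this tree are computed up front;
    # stacked over trees, this is the matrix product the abstract refers to.
    projections = [dot(q, d) for d in directions]
    level = 0
    while isinstance(node, tuple):
        threshold, left, right = node
        node = left if projections[level] < threshold else right
        level += 1
    return node


def knn_query(trees, points, q, k, min_votes):
    # One vote per tree for every point that shares a leaf with the query.
    votes = Counter()
    for tree in trees:
        votes.update(leaf_for(tree, q))
    candidates = [i for i, v in votes.items() if v >= min_votes]

    def sqdist(i):
        return sum((a - b) ** 2 for a, b in zip(points[i], q))

    # Exact distances are evaluated only for the surviving candidates.
    return sorted(candidates, key=sqdist)[:k]


rng = random.Random(0)
points = [[rng.gauss(0.0, 1.0) for _ in range(20)] for _ in range(500)]
trees = [build_tree(points, depth=5, density=0.3, rng=rng) for _ in range(10)]
neighbors = knn_query(trees, points, points[7], k=3, min_votes=2)
```

In this sketch, raising `min_votes` shrinks the candidate set and so reduces the number of exact distance evaluations, at the cost of possibly missing true neighbors. Lowering it has the opposite effect.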

