Efficient Similarity Indexing and Searching in High Dimensions
This work addresses the curse of dimensionality, a challenge for researchers and practitioners handling high-dimensional data, though the contribution appears incremental because it builds on existing indexing methods.
The paper tackles efficient similarity indexing and searching in high-dimensional data. It presents a new approach based on random partitions of the feature space, reports high effectiveness and efficiency on real data sets of several hundred dimensions, and compares its performance with the state-of-the-art locality-sensitive hashing algorithm.
Efficient indexing and searching of high-dimensional data has been an area of active research because of the growing use of high-dimensional data and the vulnerability of traditional search methods to the curse of dimensionality. This paper presents a new approach for fast and effective indexing and searching of high-dimensional features using random partitions of the feature space. Experiments on both handwritten digits and 3-D shape descriptors show the proposed algorithm to be highly effective and efficient in indexing and searching real data sets of several hundred dimensions. We also compare its performance to that of the state-of-the-art locality-sensitive hashing algorithm.
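The abstract does not say how the random partitions are built, so the following is only a minimal sketch of the general idea and not the paper's algorithm. It assumes that each partition is defined by a handful of randomly sampled data points acting as cell centers, that a point belongs to the cell of its nearest center, and that a query is answered by collecting the points sharing its cell in any partition and ranking them by exact Euclidean distance. The class name RandomPartitionIndex and the parameters num_partitions and cells_per_partition are hypothetical.

```python
import math
import random
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class RandomPartitionIndex:
    """Illustrative index built from several independent random partitions.

    ASSUMPTION: each partition is a nearest-center cell assignment over
    randomly sampled data points. The source does not specify this scheme.
    """

    def __init__(self, data, num_partitions=4, cells_per_partition=8, seed=0):
        rng = random.Random(seed)
        self.data = [tuple(p) for p in data]
        k = min(cells_per_partition, len(self.data))
        self.partitions = []  # list of (centers, cell id -> list of point ids)
        for _ in range(num_partitions):
            centers = rng.sample(self.data, k)
            cells = defaultdict(list)
            for idx, point in enumerate(self.data):
                cells[self._cell(centers, point)].append(idx)
            self.partitions.append((centers, cells))

    @staticmethod
    def _cell(centers, point):
        # The index of the nearest center identifies the cell.
        return min(range(len(centers)),
                   key=lambda i: euclidean(centers[i], point))

    def query(self, q, top_k=1):
        """Return up to top_k (distance, point id) pairs, nearest first."""
        q = tuple(q)
        candidates = set()
        for centers, cells in self.partitions:
            # Only points sharing the query's cell are examined.
            candidates.update(cells.get(self._cell(centers, q), ()))
        ranked = sorted((euclidean(self.data[i], q), i) for i in candidates)
        return ranked[:top_k]


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic 200-dimensional points, used only to exercise the sketch.
    points = [[rng.random() for _ in range(200)] for _ in range(500)]
    index = RandomPartitionIndex(points)
    print(index.query(points[42], top_k=3))
```

Using several independent partitions is a design choice in this sketch: a true neighbor that falls just across a cell boundary in one partition may still share a cell with the query in another, at the cost of examining more candidates. The search remains approximate, since neighbors that never share a cell with the query are missed.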