IR, Jun 16, 2014
Eclipse Hashing: Alexandrov Compactification and Hashing with Hyperspheres for Fast Similarity Search
Yui Noma, Makiko Konoshima
Similarity searches over vast collections of high-dimensional feature vectors have a wide range of applications. One way of conducting a fast similarity search is to transform the feature vectors into binary vectors and perform the search by using the Hamming distance. Such a transformation is a hashing method, and the choice of hashing function is important. Hashing methods using hyperplanes or hyperspheres have been proposed. The study reported here is inspired by Spherical LSH, and we use hyperspheres to hash the feature vectors. Our method, called Eclipse-hashing, performs a compactification of R^n by using the inverse stereographic projection, which is a kind of Alexandrov compactification. By using Eclipse-hashing, one can obtain the hypersphere-hash function without explicitly using hyperspheres. Hence, the number of nonlinear operations is reduced and the processing time of hashing becomes shorter. Furthermore, we show that, as a result of the improved approximation accuracy, Eclipse-hashing is more accurate than hyperplane-hashing.
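The idea of hashing with hyperspheres without constructing them explicitly can be illustrated with a minimal sketch. It assumes the standard inverse stereographic projection onto the unit sphere in R^(n+1) and random Gaussian hyperplanes with uniform offsets. The function names, the offset distribution, and the absence of any input scaling are assumptions made for illustration and are not taken from the paper. A hyperplane that cuts the sphere in R^(n+1) corresponds to a hypersphere (or, in the degenerate case, a hyperplane) in R^n, so a single dot product per bit stands in for a distance-to-center test.

```python
import random

def inverse_stereographic(x):
    """Map a point x in R^n onto the unit sphere S^n embedded in R^(n+1)."""
    s = sum(v * v for v in x)          # squared norm of x
    denom = s + 1.0
    return [2.0 * v / denom for v in x] + [(s - 1.0) / denom]

def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random hyperplanes (normal, offset) in R^(dim+1).

    Gaussian normals and uniform offsets are illustrative assumptions.
    """
    rng = random.Random(seed)
    return [
        ([rng.gauss(0.0, 1.0) for _ in range(dim + 1)], rng.uniform(-1.0, 1.0))
        for _ in range(n_bits)
    ]

def eclipse_hash(x, planes):
    """Hash x to a list of bits: project onto the sphere, then take the
    side of each hyperplane in R^(n+1)."""
    y = inverse_stereographic(x)
    return [
        1 if sum(w_i * y_i for w_i, y_i in zip(w, y)) + b > 0 else 0
        for w, b in planes
    ]

planes = make_hyperplanes(n_bits=8, dim=2)
code = eclipse_hash([3.0, 4.0], planes)
```

In this sketch the only nonlinear work per input vector is the projection itself, after which every bit costs one linear evaluation.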
LG, Mar 18, 2013
Markov Chain Monte Carlo for Arrangement of Hyperplanes in Locality-Sensitive Hashing
Yui Noma, Makiko Konoshima
Since Hamming distances can be calculated by bitwise operations, they can be calculated with less computational load than L2 distances. Similarity searches can therefore be performed faster in Hamming distance space. The elements of Hamming distance space are bit strings. An arrangement of hyperplanes, in turn, induces a transformation from feature vectors into feature bit strings. This transformation method is a type of locality-sensitive hashing that has been attracting attention as a way of performing approximate similarity searches at high speed. Supervised learning of hyperplane arrangements yields a method that transforms higher-dimensional feature vectors into feature bit strings that reflect the label information attached to those vectors. In this paper, we propose a supervised learning method for hyperplane arrangements in feature space that uses a Markov chain Monte Carlo (MCMC) method. We consider the probability density functions used during learning and evaluate their performance. We also consider the method for sampling the pairs of learning data needed in learning, and we evaluate its performance. We confirm that, with a suitable probability density function and sampling method, the accuracy of this learning method is greater than that of existing learning methods.
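The two building blocks described in this abstract, a hyperplane arrangement that turns a feature vector into a bit string and a bitwise Hamming distance, can be sketched as follows. The function names and the packing of bits into a Python integer are illustrative assumptions. The sketch does not cover the MCMC learning of the hyperplanes, which the abstract does not specify in enough detail to reproduce.

```python
def hyperplane_bits(x, normals, offsets):
    """Pack one bit per hyperplane into an integer.

    A bit is 1 when x lies on the positive side of the hyperplane
    w . x + b = 0. The first hyperplane gives the most significant bit.
    """
    code = 0
    for w, b in zip(normals, offsets):
        side = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
        code = (code << 1) | (1 if side > 0 else 0)
    return code

def hamming(a, b):
    """Hamming distance between two bit strings stored as integers:
    XOR marks the differing bits, then the set bits are counted."""
    return bin(a ^ b).count("1")

normals = [[1.0, 0.0], [0.0, 1.0]]   # two axis-aligned hyperplanes in R^2
offsets = [0.0, 0.0]
code_a = hyperplane_bits([1.0, -1.0], normals, offsets)   # bits 1, 0
code_b = hyperplane_bits([1.0, 2.0], normals, offsets)    # bits 1, 1
distance = hamming(code_a, code_b)                        # differs in one bit
```

The XOR-and-count step is what makes the Hamming distance cheaper than an L2 distance, which needs a multiplication and an addition per dimension.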
LG, Sep 26, 2012
Locality-Sensitive Hashing with Margin Based Feature Selection
Makiko Konoshima, Yui Noma
We propose a learning method with feature selection for Locality-Sensitive Hashing. Locality-Sensitive Hashing converts feature vectors into bit arrays. These bit arrays can be used to perform similarity searches and personal authentication. The proposed method first generates bit arrays longer than those ultimately used for similarity and other searches, and then selects, by learning, the bits that will be used. We demonstrate that this method optimizes effectively in cases such as fingerprint images, which have a large number of labels and extremely few samples sharing the same label, and we verify that it is also effective for natural images, handwritten digits, and speech features.
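The generate-long-then-select idea can be illustrated with a deliberately simple sketch. The scoring rule used here (how often a bit agrees for same-label pairs minus how often it agrees for different-label pairs) is a hypothetical stand-in chosen for illustration. It is not the margin-based criterion of the paper, which the abstract does not define, and the names `bit_scores` and `select_bits` are likewise assumptions.

```python
from itertools import combinations

def bit_scores(codes, labels):
    """Score each bit position of the long codes.

    Hypothetical score: agreement rate over same-label pairs minus
    agreement rate over different-label pairs. Higher is better.
    """
    n_bits = len(codes[0])
    scores = []
    for j in range(n_bits):
        same_agree = same_total = diff_agree = diff_total = 0
        for a, b in combinations(range(len(codes)), 2):
            agree = codes[a][j] == codes[b][j]
            if labels[a] == labels[b]:
                same_total += 1
                same_agree += agree
            else:
                diff_total += 1
                diff_agree += agree
        same_rate = same_agree / same_total if same_total else 0.0
        diff_rate = diff_agree / diff_total if diff_total else 0.0
        scores.append(same_rate - diff_rate)
    return scores

def select_bits(codes, labels, k):
    """Return the indices of the k best-scoring bits, in ascending order."""
    scores = bit_scores(codes, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

# Four 3-bit codes, two labels; only bit 0 separates the labels.
codes = [[0, 1, 0], [0, 0, 0], [1, 1, 0], [1, 0, 0]]
labels = ["a", "a", "b", "b"]
kept = select_bits(codes, labels, k=1)
short_codes = [[c[j] for j in kept] for c in codes]
```

Only the selected positions are kept at search time, so the final codes are shorter than the ones generated for learning.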