Fast and Exact Nearest Neighbor Search in Hamming Space on Full-Text Search Engines
This work addresses the need for efficient nearest neighbor search in binary codes for applications using full-text search engines, but it is incremental as it builds on known techniques.
The paper tackled the problem of enabling fast and exact nearest neighbor search in Hamming space on full-text search engines by developing an engineering solution based on existing techniques, resulting in significant speed-ups over the state-of-the-art term match approach.
A growing interest has been witnessed recently from both academia and industry in building nearest neighbor search (NNS) solutions on top of full-text search engines. Compared with other NNS systems, such solutions are capable of effectively reducing main memory consumption, coherently supporting multi-model search and being immediately ready for production deployment. In this paper, we continue the journey to explore specifically how to empower full-text search engines with fast and exact NNS in Hamming space (i.e., the set of binary codes). By revisiting three techniques (bit operation, subs-code filtering and data preprocessing with permutation) in information retrieval literature, we develop a novel engineering solution for full-text search engines to efficiently accomplish this special but important NNS task. In the experiment, we show that our proposed approach enables full-text search engines to achieve significant speed-ups over its state-of-the-art term match approach for NNS within binary codes.