Using Apache Lucene to Search Vector of Locally Aggregated Descriptors
This is an incremental improvement for computer vision researchers and practitioners needing faster retrieval of visual descriptors using text search engines.
The paper tackles the problem of efficient similarity search over Vector of Locally Aggregated Descriptors (VLAD) by extending Surrogate Text Representation (STR) so that search results no longer need reordering, achieving performance close to that of the original VLAD vectors on a public dataset.
Surrogate Text Representation (STR) is a profitable solution for efficient similarity search in metric spaces using conventional text search engines, such as Apache Lucene. The technique compares permutations of a set of reference objects in place of the original metric distance. However, the Achilles heel of the STR approach is the need to reorder the result set of the search according to the metric distance. This forces the use of a support database to store the original objects, which requires efficient random I/O on fast secondary memory (such as flash-based storage). In this paper, we propose to extend the Surrogate Text Representation to specifically address a class of visual metric objects known as Vector of Locally Aggregated Descriptors (VLAD). The approach represents each of the sub-vectors forming the VLAD vector with the STR, providing a finer representation of the vector and enabling us to get rid of the reordering phase. Experiments on a publicly available dataset show that the extended STR outperforms the baseline STR, achieving satisfactory performance close to that obtained with the original VLAD vectors.
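
To make the idea concrete, the following is a minimal sketch of how a surrogate text could be built from a permutation of reference objects, and how the same step could be applied to each VLAD sub-vector. It is an illustration under stated assumptions, not the paper's implementation: the function names (`surrogate_text`, `vlad_surrogate_text`), the Euclidean distance, the truncation to the `k` nearest reference objects, and the term-repetition scheme (a closer reference object is repeated more often, so that term frequency encodes rank) are all assumptions that the abstract does not specify. Indexing the resulting strings in Apache Lucene is not shown.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences of floats."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def surrogate_text(vector, references, k, prefix="R"):
    """Encode `vector` as text from its permutation of reference objects.

    Assumption: only the k nearest reference objects are kept, and the
    reference at rank i (0 = nearest) is emitted k - i times, so a text
    engine's term-frequency weighting reflects the ordering.
    """
    # Permutation: reference indices sorted by distance to the vector.
    order = sorted(range(len(references)),
                   key=lambda j: euclidean(vector, references[j]))
    terms = []
    for rank, ref_id in enumerate(order[:k]):
        terms.extend([f"{prefix}{ref_id}"] * (k - rank))
    return " ".join(terms)


def vlad_surrogate_text(vlad, num_subvectors, references_per_subvector, k):
    """Apply the surrogate text step to each sub-vector of a VLAD vector.

    Assumption: the VLAD vector is a concatenation of `num_subvectors`
    equal-length sub-vectors, each with its own (hypothetical) list of
    reference objects. Terms are prefixed with the sub-vector index so
    that different sub-vectors never share a term.
    """
    dim = len(vlad) // num_subvectors
    parts = []
    for s in range(num_subvectors):
        sub = vlad[s * dim:(s + 1) * dim]
        parts.append(surrogate_text(sub, references_per_subvector[s], k,
                                    prefix=f"S{s}R"))
    return " ".join(parts)


# Toy data: four 2-D reference objects shared by both sub-vectors.
refs = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
single = surrogate_text([0.9, 0.1], refs, k=2)
combined = vlad_surrogate_text([0.9, 0.1, 0.1, 0.9], 2, [refs, refs], k=2)
print(single)
print(combined)
```

Because each sub-vector contributes its own block of terms, two VLAD vectors share many terms only when they are close in many sub-vectors at once, which is one way to read the abstract's claim of a finer representation.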