IR, Sep 22, 2015

Diverse Yet Efficient Retrieval using Hash Functions

Vidyadhar Rao, Prateek Jain, C. V. Jawahar

arXiv:1509.06553v23.2

Originality: Incremental advance

AI Analysis

This work addresses the need for fast and effective retrieval systems in applications such as image retrieval and multi-label prediction, though it appears incremental because it builds on existing hashing techniques.

The paper tackles the problem of achieving accurate, diverse, and efficient retrieval simultaneously, presenting a method based on randomized locality sensitive hashing that provides a trade-off between accuracy and diversity, resulting in a 100x speed-up over existing diverse retrieval approaches.

Typical retrieval systems have three requirements: a) accurate retrieval, i.e., the method should have high precision; b) diverse retrieval, i.e., the obtained set of points should be diverse; and c) efficient retrieval, i.e., the retrieval time should be small. However, most existing methods address only one or two of these requirements. In this work, we present a method based on randomized locality sensitive hashing that tries to address all three simultaneously. While earlier hashing approaches considered approximate retrieval acceptable only for the sake of efficiency, we argue that one can further exploit approximate retrieval to provide impressive trade-offs between accuracy and diversity. We extend our method to the problem of multi-label prediction, where the goal is to output a diverse and accurate set of labels for a given document in real time. Moreover, we introduce a new notion to simultaneously evaluate a method's performance on both the precision and diversity measures. Finally, we present empirical results on several different retrieval tasks and show that our method retrieves diverse and accurate images/labels while ensuring a $100\times$ speed-up over the existing diverse retrieval approaches.
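To make the idea of hashing-based approximate retrieval concrete, here is a minimal sketch of random-hyperplane locality sensitive hashing in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names (`make_hyperplanes`, `hash_code`, `build_table`, `retrieve`), the choice of sign-of-projection hashing, and the strategy of sampling results uniformly from the query's bucket are all hypothetical choices made for this example.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplanes in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_code(vec, planes):
    """One bit per hyperplane: which side of the plane the vector lies on."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def build_table(points, planes):
    """Group point indices by their hash code (a single hash table)."""
    table = defaultdict(list)
    for idx, point in enumerate(points):
        table[hash_code(point, planes)].append(idx)
    return table


def retrieve(query, table, planes, k, seed=0):
    """Return up to k indices sampled from the query's bucket.

    Assumption for illustration only: sampling from the bucket, instead of
    ranking exact nearest neighbours, is one simple way to obtain results
    that are close to the query yet not all near-duplicates of each other.
    """
    candidates = table.get(hash_code(query, planes), [])
    rng = random.Random(seed)
    return rng.sample(candidates, min(k, len(candidates)))


if __name__ == "__main__":
    data_rng = random.Random(42)
    points = [[data_rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(1000)]
    planes = make_hyperplanes(num_bits=6, dim=8, seed=1)
    table = build_table(points, planes)
    print(retrieve(points[0], table, planes, k=5))
```

In this sketch a lookup only touches the points that share the query's bucket, which is where the speed-up over scanning the whole dataset comes from. The number of hash bits acts as a knob: fewer bits give larger buckets and more varied results, while more bits give smaller buckets of points that are closer to the query.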
