Optimizing Ranking Measures for Compact Binary Code Learning
This work addresses the need for more effective hashing techniques in information retrieval, although the contribution is incremental because it builds on existing structured output learning methods.
The authors tackle the problem of directly optimizing multivariate performance measures such as AUC and NDCG when learning hash functions for large-scale information retrieval. The resulting framework, StructHash, is reported to outperform a few state-of-the-art hashing methods on ranking prediction and image retrieval.
Hashing has proven a valuable tool for large-scale information retrieval. Despite much success, existing hashing methods optimize over simple objectives such as the reconstruction error or graph Laplacian related loss functions, instead of the performance evaluation criteria of interest, namely multivariate performance measures such as the AUC and NDCG. Here we present a general framework (termed StructHash) that allows one to directly optimize multivariate performance measures. The resulting optimization problem can involve exponentially or infinitely many variables and constraints, which is more challenging than standard structured output learning. To solve the StructHash optimization problem, we use a combination of column generation and cutting-plane techniques. We demonstrate the generality of StructHash by applying it to ranking prediction and image retrieval, and show that it outperforms a few state-of-the-art hashing methods.
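To make the two measures named above concrete, the following is a minimal Python sketch of how AUC and NDCG might be computed for a ranking induced by Hamming distance between binary codes. It only evaluates a given set of codes and is not the StructHash training procedure. The helper names (`hamming`, `rank_by_hamming`, `ndcg_at_k`, `auc`), the toy codes, and the exponential gain `2**r - 1` used for NDCG are assumptions made for illustration, not details taken from the paper.

```python
import math


def hamming(a, b):
    """Count the positions where two equal-length bit sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def rank_by_hamming(query, database):
    """Return database indices sorted by ascending Hamming distance to the query.

    Python's sort is stable, so ties keep their original database order.
    """
    return sorted(range(len(database)), key=lambda i: hamming(query, database[i]))


def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k items (assumed gain: 2**r - 1)."""
    return sum((2 ** r - 1) / math.log2(pos + 2)
               for pos, r in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances, k):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    return dcg_at_k(ranked_relevances, k) / ideal if ideal > 0 else 0.0


def auc(scores, labels):
    """Fraction of (relevant, irrelevant) pairs ranked correctly; ties count 0.5.

    A higher score means the item is ranked as more relevant.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one relevant and one irrelevant item")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical 4-bit codes and binary relevance labels, for illustration only.
query = (1, 0, 1, 1)
database = [(1, 0, 1, 1), (1, 0, 1, 0), (0, 1, 0, 0), (1, 1, 1, 1), (0, 1, 0, 1)]
labels = [1, 0, 0, 1, 0]

order = rank_by_hamming(query, database)
ranked_labels = [labels[i] for i in order]
scores = [-hamming(query, code) for code in database]  # closer code = higher score

print("ranking:", order)
print("NDCG@5:", ndcg_at_k(ranked_labels, 5))
print("AUC:", auc(scores, labels))
```

Both quantities depend on the whole ranked list rather than decomposing into a sum of per-item losses, which is the sense in which the abstract calls them multivariate performance measures.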