CV · Apr 27, 2016

Efficient Optimization for Rank-based Loss Functions

Pritish Mohapatra, Michal Rolinek, C. V. Jawahar, Vladimir Kolmogorov, M. Pawan Kumar

arXiv:1604.08269v314.540 citations

Originality: Incremental advance

AI Analysis

This work addresses a computational bottleneck in training retrieval systems for vision tasks, offering an incremental improvement over existing methods.

The paper tackles the efficient optimization of non-differentiable, non-decomposable rank-based loss functions such as AP and NDCG, which are central to evaluating information retrieval systems. It introduces a quicksort-flavored algorithm that reduces the computational complexity of loss-augmented inference, and reports significantly better results than simpler decomposable loss functions at a comparable training time.

The accuracy of information retrieval systems is often measured using complex loss functions such as the average precision (AP) or the normalized discounted cumulative gain (NDCG). Given a set of positive and negative samples, the parameters of a retrieval system can be estimated by minimizing these loss functions. However, the non-differentiability and non-decomposability of these loss functions do not allow for simple gradient-based optimization algorithms. This issue is generally circumvented by either optimizing a structured hinge-loss upper bound to the loss function or by using asymptotic methods like the direct-loss minimization framework. Yet, the high computational complexity of loss-augmented inference, which is necessary for both frameworks, prohibits its use with large training data sets. To alleviate this deficiency, we present a novel quicksort-flavored algorithm for a large class of non-decomposable loss functions. We provide a complete characterization of the loss functions that are amenable to our algorithm, and show that it includes both AP-based and NDCG-based loss functions. Furthermore, we prove that no comparison-based algorithm can improve upon the computational complexity of our approach asymptotically. We demonstrate the effectiveness of our approach in the context of optimizing the structured hinge-loss upper bound of the AP and NDCG losses for learning models for a variety of vision tasks. We show that our approach provides significantly better results than simpler decomposable loss functions, while requiring a comparable training time.
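To make the two measures named in the abstract concrete, here is a minimal sketch of how AP and NDCG might be computed for a single ranking. It assumes binary relevance labels (1 for positive, 0 for negative), a ranking obtained by sorting samples by decreasing score, and the common log2 discount for NDCG; the paper's exact definitions may differ. The function names `average_precision` and `ndcg` are illustrative and not taken from the paper. The sketch only evaluates the losses and does not implement the paper's quicksort-flavored loss-augmented inference.

```python
import math


def average_precision(scores, labels):
    """AP of the ranking induced by sorting samples by decreasing score.

    Assumes binary labels: 1 = positive, 0 = negative.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            # Precision at this rank, accumulated only at positive samples.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def ndcg(scores, labels):
    """NDCG with a 1 / log2(1 + rank) discount (an assumed, common choice)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    dcg = sum(labels[i] / math.log2(rank + 1)
              for rank, i in enumerate(order, start=1))
    # Ideal DCG: all positives placed at the top of the ranking.
    ideal = sorted(labels, reverse=True)
    idcg = sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0


# Toy example: positives end up at ranks 1 and 3.
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
ap_loss = 1.0 - average_precision(scores, labels)
ndcg_loss = 1.0 - ndcg(scores, labels)
print(f"AP loss: {ap_loss:.4f}, NDCG loss: {ndcg_loss:.4f}")
```

Both values depend on the position of every sample in the ranking, so neither can be written as a sum of independent per-sample terms. This is the non-decomposability the abstract refers to.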

