ML LG · Nov 18, 2015

Metric Learning with Adaptive Density Discrimination

Oren Rippel, Manohar Paluri, Piotr Dollar, Lubomir Bourdev

arXiv:1511.05939v232.3222 citations

Originality: Highly original

AI Analysis

This work aims to make distance metric learning competitive with classification methods on fine-grained visual recognition, reporting gains in both accuracy and training speed.

The paper tackles the performance gap between distance metric learning (DML) and modern classification algorithms. It proposes an approach that models the class distributions in representation space, uses that model to assess similarity adaptively, and penalizes overlap between class distributions. On several fine-grained visual recognition datasets it reports state-of-the-art classification results, a 30-40% relative improvement over triplet loss, and the same error reached in 5-30 times fewer training iterations.

Distance metric learning (DML) approaches learn a transformation to a representation space where distance is in correspondence with a predefined notion of similarity. While such models offer a number of compelling benefits, it has been difficult for these to compete with modern classification algorithms in performance and even in feature extraction. In this work, we propose a novel approach explicitly designed to address a number of subtle yet important issues which have stymied earlier DML algorithms. It maintains an explicit model of the distributions of the different classes in representation space. It then employs this knowledge to adaptively assess similarity, and achieve local discrimination by penalizing class distribution overlap. We demonstrate the effectiveness of this idea on several tasks. Our approach achieves state-of-the-art classification results on a number of fine-grained visual recognition datasets, surpassing the standard softmax classifier and outperforming triplet loss by a relative margin of 30-40%. In terms of computational performance, it alleviates training inefficiencies in the traditional triplet loss, reaching the same error in 5-30 times fewer iterations. Beyond classification, we further validate the saliency of the learnt representations via their attribute concentration and hierarchy recovery properties, achieving 10-25% relative gains on the softmax classifier and 25-50% on triplet loss in these tasks.
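The idea of penalizing class distribution overlap can be made concrete with a small sketch. This listing does not give the paper's actual loss, so the NumPy code below is only an illustration under stated assumptions: each class is summarized by a single mean, distances are scaled by the pooled within-class variance, and a hinge with a hypothetical `margin` penalizes examples that sit nearly as close to another class's mean as to their own. The function name `overlap_penalty` and all of its parameters are invented for this example.

```python
import numpy as np

def overlap_penalty(reps, labels, margin=1.0):
    """Illustrative class-overlap penalty (an assumption, not the paper's loss).

    reps:   (N, D) array of learned representations.
    labels: (N,) array of integer class labels; at least two classes.
    """
    classes = np.unique(labels)
    if len(classes) < 2:
        raise ValueError("need at least two classes to measure overlap")

    # One mean per class stands in for the model of class distributions.
    means = np.stack([reps[labels == c].mean(axis=0) for c in classes])

    # Squared distance from every example to every class mean: shape (N, C).
    sq_dist = ((reps[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)

    # own[n, c] is True where class c is the class of example n.
    own = labels[:, None] == classes[None, :]

    # Pooled variance of examples about their own class mean sets the scale.
    variance = sq_dist[own].sum() / max(len(reps) - 1, 1) + 1e-12
    scaled = -sq_dist / (2.0 * variance)

    # Similarity to the example's own class, reduced by the margin.
    own_term = scaled[own] - margin

    # Log-sum-exp of similarities to all other classes (numerically stable).
    other = np.where(own, -np.inf, scaled)
    peak = other.max(axis=1)
    other_term = peak + np.log(np.exp(other - peak[:, None]).sum(axis=1))

    # Hinge: only examples whose class neighborhoods overlap contribute.
    return np.maximum(0.0, other_term - own_term).mean()
```

Under these assumptions, well-separated classes incur no penalty, while interleaved classes incur a positive one. In a training loop this quantity would be minimized with respect to the network that produces `reps`, which requires an autodiff framework rather than plain NumPy.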
