LG CV MLJun 17, 2015

Learning with a Wasserstein Loss

Charlie Frogner, Chiyuan Zhang, Hossein Mobahi, Mauricio Araya-Polo, Tomaso Poggio

arXiv:1506.05439v336.2674 citations

Originality Incremental advance

AI Analysis

This addresses the problem of improving multi-label predictions for applications like tag prediction, but it is incremental as it builds on existing Wasserstein approximations.

The paper tackles the challenge of multi-label learning by developing a loss function based on the Wasserstein distance, which improves predictions by leveraging a natural metric on outputs, and demonstrates this on a tag prediction problem with the Yahoo Flickr Creative Commons dataset, outperforming a baseline.

Learning to predict multi-label outputs is challenging, but in many problems there is a natural metric on the outputs that can be used to improve predictions. In this paper we develop a loss function for multi-label learning, based on the Wasserstein distance. The Wasserstein distance provides a natural notion of dissimilarity for probability measures. Although optimizing with respect to the exact Wasserstein distance is costly, recent work has described a regularized approximation that is efficiently computed. We describe an efficient learning algorithm based on this regularization, as well as a novel extension of the Wasserstein distance from probability measures to unnormalized measures. We also describe a statistical learning bound for the loss. The Wasserstein loss can encourage smoothness of the predictions with respect to a chosen metric on the output space. We demonstrate this property on a real-data tag prediction problem, using the Yahoo Flickr Creative Commons dataset, outperforming a baseline that doesn't use the metric.

View on arXiv PDF

Similar