ML · Oct 16, 2013

Fast Computation of Wasserstein Barycenters

arXiv:1310.4375v3 · 814 citations
AI Analysis

This work addresses a computational bottleneck of optimal transport for machine learning researchers and practitioners: it makes Wasserstein barycenters considerably cheaper to compute than a direct subgradient approach would.

The authors address the cost of computing Wasserstein barycenters by introducing two algorithms that smooth the Wasserstein distance with an entropic regularizer, so that gradients can be computed cheaply with matrix scaling. The approach reduces computational cost and is applied to visualizing a large set of images and to solving a constrained clustering problem.

We present new algorithms to compute the mean of a set of empirical probability measures under the optimal transport metric. This mean, known as the Wasserstein barycenter, is the measure that minimizes the sum of its Wasserstein distances to each element in that set. We propose two original algorithms to compute Wasserstein barycenters that build upon the subgradient method. A direct implementation of these algorithms is, however, too costly because it would require the repeated resolution of large primal and dual optimal transport problems to compute subgradients. Extending the work of Cuturi (2013), we propose to smooth the Wasserstein distance used in the definition of Wasserstein barycenters with an entropic regularizer and recover in doing so a strictly convex objective whose gradients can be computed for a considerably cheaper computational cost using matrix scaling algorithms. We use these algorithms to visualize a large family of images and to solve a constrained clustering problem.
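The matrix scaling step mentioned in the abstract can be illustrated with a small sketch. The code below is not the authors' barycenter algorithm; it only shows the Sinkhorn-style iteration from Cuturi (2013) that produces an entropy-regularized transport plan between two histograms. The function name `sinkhorn_plan`, the parameter `reg` (assumed here to play the role of the inverse of the regularization weight), the iteration count, and the toy histograms are all assumptions made for illustration.

```python
import numpy as np

def sinkhorn_plan(a, b, M, reg=0.1, n_iter=2000):
    """Sketch: entropy-regularized transport plan between histograms a and b.

    a, b : 1-D arrays of positive weights, each summing to 1.
    M    : ground cost matrix of shape (len(a), len(b)).
    reg  : regularization strength (hypothetical name, assumed > 0).
    """
    K = np.exp(-M / reg)            # elementwise Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)           # rescale columns to match b
        u = a / (K @ v)             # rescale rows to match a
    # The plan has the form diag(u) K diag(v).
    return u[:, None] * K * v[None, :]

# Toy example: two histograms on five points of the unit interval.
x = np.linspace(0.0, 1.0, 5)
M = (x[:, None] - x[None, :]) ** 2  # squared Euclidean ground cost
a = np.array([0.4, 0.3, 0.1, 0.1, 0.1])
b = np.array([0.1, 0.1, 0.1, 0.3, 0.4])

P = sinkhorn_plan(a, b, M)
cost = float(np.sum(P * M))         # transport cost of the regularized plan
```

Each iteration costs only two matrix-vector products, which is why this kind of scaling is much cheaper than solving a linear program for every subgradient. For small `reg` the kernel entries can underflow, so a practical implementation would likely work in the log domain.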

Code Implementations · 2 repos
