ML | LG | Oct 9, 2019

Learning with minibatch Wasserstein : asymptotic and gradient properties

arXiv:1910.04091v4 | 108 citations
Originality: Incremental advance
AI Analysis

This work addresses the scalability of optimal transport for machine learning practitioners. It is incremental in that it analyzes an existing practice rather than introducing a new method.

The paper tackles the computational cost of optimal transport distances on large datasets by analyzing the common practice of computing them on minibatches. It argues that this practice acts as an implicit regularization of the original problem, with properties such as unbiased estimators and gradients, but at the price of losing the distance property. Empirical experiments on tasks such as GANs demonstrate its practical utility.

Optimal transport distances are powerful tools to compare probability distributions and have found many applications in machine learning. Yet their algorithmic complexity prevents their direct use on large-scale datasets. To overcome this challenge, practitioners compute these distances on minibatches, i.e., they average the outcome of several smaller optimal transport problems. We propose in this paper an analysis of this practice, whose effects are not well understood so far. We notably argue that it is equivalent to an implicit regularization of the original problem, with appealing properties such as unbiased estimators, gradients and a concentration bound around the expectation, but also with defects such as loss of distance property. Along with this theoretical analysis, we also conduct empirical experiments on gradient flows, GANs or color transfer that highlight the practical interest of this strategy.
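To make the minibatch practice concrete, here is a minimal sketch, not the authors' implementation. It assumes one-dimensional samples with uniform weights, so that each small optimal transport problem can be solved exactly by sorting, and it approximates the average by Monte Carlo over random pairs of minibatches. The function names `ot_cost_1d` and `minibatch_ot_1d`, the batch size, and the number of batches are all illustrative choices.

```python
import random


def ot_cost_1d(xs, ys, p=2):
    """Exact OT cost between two equal-size uniform empirical measures on the line.

    For the cost |x - y|**p with p >= 1, matching sorted points is optimal,
    so no linear program is needed in this special case.
    """
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) ** p for a, b in pairs) / len(xs)


def minibatch_ot_1d(xs, ys, batch_size, n_batches, p=2, seed=0):
    """Average the OT cost over random pairs of minibatches.

    Each minibatch is drawn without replacement from its own sample.
    This is a Monte Carlo stand-in for averaging over all minibatch pairs.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_batches):
        bx = rng.sample(xs, batch_size)
        by = rng.sample(ys, batch_size)
        total += ot_cost_1d(bx, by, p)
    return total / n_batches


# Synthetic data for illustration only.
data_rng = random.Random(42)
data = [data_rng.gauss(0.0, 1.0) for _ in range(512)]

full = ot_cost_1d(data, data)  # full OT between a sample and itself
mb = minibatch_ot_1d(data, data, batch_size=16, n_batches=200)
print("full OT:", full)
print("minibatch OT:", mb)
```

Comparing a sample with itself gives a full OT cost of exactly zero, while the minibatch average stays strictly positive because the two minibatches in each pair generally contain different points. This is one simple way to see the loss of the distance property mentioned in the abstract.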

Code Implementations: 3 repos
