A Direct $\tilde{O}(1/ε)$ Iteration Parallel Algorithm for Optimal Transport
This work addresses the computational bottleneck of optimal transport in machine learning and statistics by providing a more efficient parallel algorithm. The contribution is incremental in that it builds on prior theoretical improvements.
The paper tackles the problem of computing optimal transport distances between distributions. It presents a parallel algorithm that achieves additive error $ε$ with $\tilde{O}(1/ε)$ parallel depth and $\tilde{O}(n^2/ε)$ work, which is optimal for first-order methods barring a breakthrough on a long-standing open problem, and it bridges the gap between the best theoretical runtimes and practical algorithms.
Optimal transportation, or computing the Wasserstein or ``earth mover's'' distance between two distributions, is a fundamental primitive which arises in many learning and statistical settings. We give an algorithm which solves this problem to additive error $ε$ with $\tilde{O}(1/ε)$ parallel depth and $\tilde{O}\left(n^2/ε\right)$ work. Barring a breakthrough on a long-standing algorithmic open problem, this is optimal for first-order methods. Blanchet et al. '18 and Quanrud '19 obtained similar runtimes through reductions to positive linear programming and matrix scaling. However, these reduction-based algorithms use complicated subroutines which may be deemed impractical because they require solvers for second-order iterations (matrix scaling) or are not parallelizable (positive LP). The fastest practical algorithms run in time $\tilde{O}(\min(n^2 / ε^2, n^{2.5} / ε))$ (Dvurechensky et al. '18, Lin et al. '19). We bridge this gap by providing a parallel, first-order, $\tilde{O}(1/ε)$ iteration algorithm without worse dependence on dimension, and provide preliminary experimental evidence that our algorithm may enjoy improved practical performance. We obtain this runtime via a primal-dual extragradient method, motivated by recent theoretical improvements to maximum flow (Sherman '17).
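To give a rough feel for what an extragradient step does, the following minimal sketch runs plain Euclidean extragradient on a toy saddle-point problem. It is not the algorithm of this paper, which works on the optimal transport problem with its own regularizer and constraints, none of which are reproduced here. The toy objective $f(x, y) = xy$, the step size, the iteration count, and all function names are assumptions chosen purely for illustration.

```python
import math

def grad(x, y):
    """Partial derivatives of the toy saddle function f(x, y) = x * y."""
    return y, x  # (df/dx, df/dy)

def descent_ascent(x, y, step, iters):
    """Baseline: simultaneous gradient descent in x and ascent in y."""
    for _ in range(iters):
        gx, gy = grad(x, y)
        x, y = x - step * gx, y + step * gy
    return x, y

def extragradient(x, y, step, iters):
    """Extragradient: take a trial half step, then update the original
    point using the gradient evaluated at the trial point."""
    for _ in range(iters):
        gx, gy = grad(x, y)
        xh, yh = x - step * gx, y + step * gy    # trial (half) step
        gxh, gyh = grad(xh, yh)                  # gradient at trial point
        x, y = x - step * gxh, y + step * gyh    # actual update
    return x, y

if __name__ == "__main__":
    # Distance to the saddle point (0, 0) after the same number of steps.
    print("descent-ascent:", math.hypot(*descent_ascent(1.0, 1.0, 0.5, 200)))
    print("extragradient: ", math.hypot(*extragradient(1.0, 1.0, 0.5, 200)))
```

On this toy problem the descent-ascent iterates spiral away from the saddle point at the origin, while the extragradient iterates contract toward it. That is the basic reason extragradient-type updates are a natural choice for primal-dual formulations.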