OC LG ML Jan 21, 2020

Stochastic Approximation versus Sample Average Approximation for population Wasserstein barycenters

arXiv:2001.07697v9, 11 citations
AI Analysis

This paper addresses the computational efficiency of optimal transport methods for machine learning practitioners. Its contribution is incremental: it shows that the expected superiority of SA over SAA can be inverted in specific cases.

The paper compares Stochastic Approximation (SA) and Sample Average Approximation (SAA) for computing population Wasserstein barycenters, showing that SAA can outperform SA in terms of complexity for this problem, contrary to prior beliefs, and provides complexity bounds and confidence intervals.

In the machine learning and optimization community, there are two main approaches to the convex risk minimization problem: Stochastic Approximation (SA) and Sample Average Approximation (SAA). In terms of oracle complexity (the required number of stochastic gradient evaluations), both approaches are considered equivalent on average (up to a logarithmic factor). The total complexity depends on the specific problem; however, since the work of Nemirovski et al. (2009) it has been generally accepted that SA is better than SAA. Nevertheless, for large-scale problems SA may run out of memory, as storing all data on one machine and organizing online access to it can be impossible without communication with other machines. SAA, in contrast to SA, allows parallel/distributed calculations. We show that for the Wasserstein barycenter problem this superiority can be inverted. We provide a detailed comparison by stating the complexity bounds for the SA and the SAA implementations calculating barycenters defined with respect to optimal transport distances and entropy-regularized optimal transport distances. As a byproduct, we also construct confidence intervals for the barycenter defined with respect to entropy-regularized optimal transport distances in the $\ell_2$-norm. The preliminary results are derived for a general convex optimization problem given by an expectation, so that they have applications beyond the Wasserstein barycenter problem.
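To make the distinction between the two approaches concrete, here is a minimal sketch on a toy problem that is not taken from the paper: minimizing $F(x) = \mathbb{E}\,\tfrac{1}{2}(x - \xi)^2$ over scalar $x$, whose minimizer is the mean of $\xi$. The function names, step sizes, and the Gaussian sampling distribution are all illustrative assumptions. SA processes each sample once with a stochastic gradient step, while SAA fixes the sample and then minimizes the empirical average by an ordinary deterministic method.

```python
import random


def draw_samples(m, mean=3.0, seed=0):
    # Hypothetical data source: i.i.d. Gaussian draws with unit variance.
    rng = random.Random(seed)
    return [rng.gauss(mean, 1.0) for _ in range(m)]


def sa_estimate(samples):
    """Stochastic Approximation: a single pass, one stochastic gradient per sample."""
    x = 0.0
    for k, xi in enumerate(samples, start=1):
        grad = x - xi      # stochastic gradient of 0.5 * (x - xi)^2
        x -= grad / k      # step size 1/k (the objective is 1-strongly convex)
    return x


def saa_estimate(samples, iters=200, step=0.5):
    """Sample Average Approximation: fix the sample, then run full gradient
    descent on the empirical objective (1/m) * sum_i 0.5 * (x - xi_i)^2."""
    m = len(samples)
    x = 0.0
    for _ in range(iters):
        grad = sum(x - xi for xi in samples) / m   # full empirical gradient
        x -= step * grad
    return x


samples = draw_samples(1000)
x_sa = sa_estimate(samples)
x_saa = saa_estimate(samples)
```

On this toy objective both estimates converge to the sample mean, so the sketch shows only the mechanics of the two schemes. It says nothing about the complexity comparison for Wasserstein barycenters, where the cost of each gradient evaluation and of solving the empirical problem is what the paper analyzes. Note also that the SAA loop, unlike the SA loop, only needs averaged gradients over a fixed sample, which is the kind of computation that can be split across machines.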

