DS LG PR · Oct 18, 2021

Dimensionality Reduction for Wasserstein Barycenter

arXiv:2110.08991v2 · 23 citations
AI Analysis

This paper addresses a computational bottleneck for researchers and practitioners who use Wasserstein barycenters in high-dimensional machine learning applications, offering a speedup backed by theoretical guarantees.

The paper tackles the curse of dimensionality in computing Wasserstein barycenters. It shows that randomized dimensionality reduction can map the problem to a space of dimension O(log n) while preserving the cost of any solution up to a small error, proves matching upper and lower bounds on the reduced dimension, and gives a coreset construction that reduces the number of input distributions. Experiments validate the resulting speedup.

The Wasserstein barycenter is a geometric construct which captures the notion of centrality among probability distributions, and which has found many applications in machine learning. However, most algorithms for finding even an approximate barycenter suffer an exponential dependence on the dimension $d$ of the underlying space of the distributions. In order to cope with this "curse of dimensionality," we study dimensionality reduction techniques for the Wasserstein barycenter problem. When the barycenter is restricted to support of size $n$, we show that randomized dimensionality reduction can be used to map the problem to a space of dimension $O(\log n)$ independent of both $d$ and $k$, and that any solution found in the reduced dimension will have its cost preserved up to arbitrarily small error in the original space. We provide matching upper and lower bounds on the size of the reduced dimension, showing that our methods are optimal up to constant factors. We also provide a coreset construction for the Wasserstein barycenter problem that significantly decreases the number of input distributions. The coresets can be used in conjunction with random projections and thus further improve computation time. Lastly, our experimental results validate the speedup provided by dimensionality reduction while maintaining solution quality.
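
To make the random-projection idea concrete, here is a minimal sketch in Python. It assumes a Gaussian Johnson-Lindenstrauss-style map, which is a standard choice that the abstract itself does not specify. It is not the paper's algorithm: it only projects the pooled support points of a few synthetic distributions and measures how far pairwise squared distances move, whereas the paper's guarantee concerns the barycenter cost. The function names, the synthetic data, and the sizes (`k`, `support`, `d`, `m`) are all illustrative assumptions.

```python
import numpy as np


def random_projection(points, target_dim, rng):
    """Map the rows of `points` from d to `target_dim` dimensions with a Gaussian matrix."""
    d = points.shape[1]
    # Entries are N(0, 1/target_dim), so squared norms are preserved in expectation.
    G = rng.normal(0.0, 1.0 / np.sqrt(target_dim), size=(d, target_dim))
    return points @ G


def pairwise_sq_dists(points):
    """Return the matrix of squared Euclidean distances between all rows."""
    sq = np.sum(points ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * (points @ points.T)
    return np.maximum(D, 0.0)  # clip tiny negatives caused by round-off


def max_distortion(original, projected):
    """Largest relative change in squared distance over all distinct pairs."""
    D0 = pairwise_sq_dists(original)
    D1 = pairwise_sq_dists(projected)
    off_diag = ~np.eye(len(original), dtype=bool)
    return float(np.max(np.abs(D1[off_diag] / D0[off_diag] - 1.0)))


# Hypothetical setup: k distributions, each supported on `support` points in R^d.
rng = np.random.default_rng(0)
k, support, d, m = 4, 10, 2000, 400  # m is an illustrative target dimension
points = rng.normal(size=(k * support, d))

projected = random_projection(points, m, rng)
distortion = max_distortion(points, projected)
print(projected.shape, distortion)
```

In a full pipeline one would run a barycenter solver on the projected points and then evaluate the resulting solution in the original space. That solver is deliberately left out here.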

