MLLGJul 14, 2020

Estimating Barycenters of Measures in High Dimensions

arXiv:2007.07105v228 citations
AI Analysis

This addresses a computational bottleneck for researchers and practitioners in machine learning and statistics dealing with high-dimensional data, representing a novel method rather than an incremental improvement.

The paper tackles the problem of estimating barycenters of measures in high dimensions, where existing methods fail due to scalability issues, and proposes a scalable algorithm that achieves good performance and scales to thousands of dimensions.

Barycentric averaging is a principled way of summarizing populations of measures. Existing algorithms for estimating barycenters typically parametrize them as weighted sums of Diracs and optimize their weights and/or locations. However, these approaches do not scale to high-dimensional settings due to the curse of dimensionality. In this paper, we propose a scalable and general algorithm for estimating barycenters of measures in high dimensions. The key idea is to turn the optimization over measures into an optimization over generative models, introducing inductive biases that allow the method to scale while still accurately estimating barycenters. We prove local convergence under mild assumptions on the discrepancy showing that the approach is well-posed. We demonstrate that our method is fast, achieves good performance on low-dimensional problems, and scales to high-dimensional settings. In particular, our approach is the first to be used to estimate barycenters in thousands of dimensions.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes