ML · LG · OC · Apr 2

Random Coordinate Descent on the Wasserstein Space of Probability Measures

arXiv:2604.01606
AI Analysis

This work targets researchers and practitioners in machine learning and mean-field modeling who face high-dimensional or ill-conditioned optimization problems over probability measures. It is an incremental advance that adapts coordinate descent to the Wasserstein space.

The paper addresses the high computational overhead of optimization over probability measures under the Wasserstein geometry by proposing randomized coordinate descent methods. In numerical experiments on ill-conditioned energies, these methods achieve significant speedups over full-gradient approaches.

Optimization over the space of probability measures endowed with the Wasserstein-2 geometry is central to modern machine learning and mean-field modeling. However, traditional methods relying on full Wasserstein gradients often suffer from high computational overhead in high-dimensional or ill-conditioned settings. We propose a randomized coordinate descent framework specifically designed for the Wasserstein manifold, introducing both Random Wasserstein Coordinate Descent (RWCD) and Random Wasserstein Coordinate Proximal-Gradient (RWCP) for composite objectives. By exploiting coordinate-wise structures, our methods adapt to anisotropic objective landscapes where full-gradient approaches typically struggle. We provide a rigorous convergence analysis across various landscape geometries, establishing guarantees under non-convex, Polyak-Łojasiewicz, and geodesically convex conditions. Our theoretical results mirror the classic convergence properties found in Euclidean space, revealing a compelling symmetry between coordinate descent on vectors and on probability measures. The developed techniques are inherently adaptive to the Wasserstein geometry and offer a robust analytical template that can be extended to other optimization solvers within the space of measures. Numerical experiments on ill-conditioned energies demonstrate that our framework offers significant speedups over conventional full-gradient methods.
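To make the idea of coordinate-wise updates on a measure concrete, here is a minimal particle sketch. It is not the paper's RWCD algorithm, whose details the abstract does not give. It assumes a potential energy F(μ) = ∫ V dμ with a hypothetical anisotropic quadratic V(x) = ½ Σ_j a_j x_j², for which the Wasserstein gradient at a point x is ∇V(x). The function names, the curvature vector `a`, and the step size `0.5 / a[k]` are all illustrative choices.

```python
import numpy as np

def energy(x, a):
    """Particle estimate of F(mu) = integral of V d(mu), V(x) = 0.5 * sum_j a[j] * x[j]**2."""
    return 0.5 * np.mean(np.sum(a * x**2, axis=1))

def coord_grad(x, a, k):
    """k-th partial derivative of V, evaluated at every particle (assumed quadratic V)."""
    return a[k] * x[:, k]

def random_coordinate_descent(x0, a, n_steps, seed=0):
    """Move all particles along one randomly chosen coordinate per step.

    Illustrative sketch only: the step size 0.5 / a[k] scales with the
    curvature of the chosen coordinate, an assumption made here to show
    how per-coordinate steps can adapt to an anisotropic landscape.
    """
    rng = np.random.default_rng(seed)
    x = x0.copy()
    history = [energy(x, a)]
    for _ in range(n_steps):
        k = rng.integers(x.shape[1])         # pick a coordinate uniformly at random
        step = 0.5 / a[k]                    # coordinate-wise step size (assumed)
        x[:, k] -= step * coord_grad(x, a, k)  # push particles along coordinate k
        history.append(energy(x, a))
    return x, history

# Usage: 200 particles in 3 dimensions with curvatures spanning three orders of magnitude.
rng = np.random.default_rng(1)
x0 = rng.normal(size=(200, 3))
a = np.array([1.0, 10.0, 1000.0])
x_final, history = random_coordinate_descent(x0, a, n_steps=60)
```

In this toy setting each update halves the chosen coordinate of every particle, so the energy never increases. A full-gradient step with a single step size would instead be limited by the largest curvature, here 1000.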

