Linear convergence of proximal descent schemes on the Wasserstein space
This work addresses optimization challenges in machine learning and statistics by providing a theoretical foundation for faster convergence in Wasserstein space, though it is incremental as it builds on prior analyses.
The paper tackles the problem of optimizing entropy-regularized functionals on the Wasserstein space by establishing linear convergence of proximal descent methods under flat convexity assumptions, relaxing the reliance on geodesic convexity and achieving this through a uniform logarithmic Sobolev inequality and entropy sandwich lemma.
We investigate proximal descent methods, inspired by the minimizing movement scheme introduced by Jordan, Kinderlehrer and Otto, for optimizing entropy-regularized functionals on the Wasserstein space. We establish linear convergence under flat convexity assumptions, thereby relaxing the common reliance on geodesic convexity. Our analysis circumvents the need for discrete-time adaptations of the Evolution Variational Inequality (EVI). Instead, we leverage a uniform logarithmic Sobolev inequality (LSI) and the entropy "sandwich" lemma, extending the analysis from arXiv:2201.10469 and arXiv:2202.01009. The major challenge in the proof via LSI is to show that the relative Fisher information $I(\cdot|π)$ is well-defined at every step of the scheme. Since the relative entropy is not Wasserstein differentiable, we prove that along the scheme the iterates belong to a certain class of Sobolev regularity, and hence the relative entropy $\operatorname{KL}(\cdot|π)$ has a unique Wasserstein sub-gradient, and that the relative Fisher information is indeed finite.