LGJun 15, 2023Code
A Gromov--Wasserstein Geometric View of Spectrum-Preserving Graph CoarseningYifan Chen, Rentian Yao, Yun Yang et al.
Graph coarsening is a technique for solving large-scale graph problems by working on a smaller version of the original graph, and possibly interpolating the results back to the original graph. It has a long history in scientific computing and has recently gained popularity in machine learning, particularly in methods that preserve the graph spectrum. This work studies graph coarsening from a different perspective, developing a theory for preserving graph distances and proposing a method to achieve this. The geometric approach is useful when working with a collection of graphs, such as in graph classification and regression. In this study, we consider a graph as an element on a metric space equipped with the Gromov--Wasserstein (GW) distance, and bound the difference between the distance of two graphs and their coarsened versions. Minimizing this difference can be done using the popular weighted kernel $K$-means method, which improves existing spectrum-preserving methods with the proper choice of the kernel. The study includes a set of experiments to support the theory and method, including approximating the GW distance, preserving the graph spectrum, classifying graphs using spectral information, and performing regression using graph convolutional networks. Code is available at https://github.com/ychen-stat-ml/GW-Graph-Coarsening .
PRApr 16
Quantitative Stability of Many-Marginal Schrodinger BridgeRentian Yao, Young-Heon Kim, Geoffrey Schiebinger
In this paper, we explore quantitative stability of multi-marginal Schrödinger bridges with respect to the marginal constraints. We focus on the case where the number of marginal constraints is large (i.e. ``many-marginals"). When this number increases, we show that the Kullback--Leibler (KL) divergence between two multi-marginal Schrödinger bridges, as measures on the path space, can be asymptotically bounded by the terminal marginal KL divergence and a time-integrated squared discrepancy {that combines} Wasserstein-2 geodesic velocity fields with a log-density gradient term. Our stability upper bound is also asymptotically tight: it converges to zero as the number of marginal constraints increases with unperturbed marginal constraints. To the best of our knowledge, this is the first such stability result that addresses the many-marginal regime, giving error estimates that are asymptotically independent of the number of marginals. To achieve our result, the key step is to derive an asymptotic expansion (of order $k\ge 2$) of Schrödinger potentials with respect to a diminishing regularization coefficient. This result can also be applied to deriving asymptotic expansions of entropic Brenier maps in entropic optimal self-transport problems. As byproducts of our analyses, we also establish the asymptotic expansion of entropic optimal transport cost with respect to the diminishing regularization coefficient when two marginal constraints are sufficiently close. We also prove a stability property of the Schrödinger functional.
MLJan 24, 2025
Optimal Transport Barycenter via Nonconvex-Concave Minimax OptimizationKaheon Kim, Rentian Yao, Changbo Zhu et al.
The optimal transport barycenter (a.k.a. Wasserstein barycenter) is a fundamental notion of averaging that extends from the Euclidean space to the Wasserstein space of probability distributions. Computation of the unregularized barycenter for discretized probability distributions on point clouds is a challenging task when the domain dimension $d > 1$. Most practical algorithms for approximating the barycenter problem are based on entropic regularization. In this paper, we introduce a nearly linear time $O(m \log{m})$ and linear space complexity $O(m)$ primal-dual algorithm, the Wasserstein-Descent $\dot{\mathbb{H}}^1$-Ascent (WDHA) algorithm, for computing the exact barycenter when the input probability density functions are discretized on an $m$-point grid. The key success of the WDHA algorithm hinges on alternating between two different yet closely related Wasserstein and Sobolev optimization geometries for the primal barycenter and dual Kantorovich potential subproblems. Under reasonable assumptions, we establish the convergence rate and iteration complexity of WDHA to its stationary point when the step size is appropriately chosen. Superior computational efficacy, scalability, and accuracy over the existing Sinkhorn-type algorithms are demonstrated on high-resolution (e.g., $1024 \times 1024$ images) 2D synthetic and real data.
MLOct 26, 2025
Statistical Analysis of the Sinkhorn Iterations for Two-Sample Schrödinger Bridge EstimationIbuki Maeda, Rentian Yao, Atsushi Nitanda
The Schrödinger bridge problem seeks the optimal stochastic process that connects two given probability distributions with minimal energy modification. While the Sinkhorn algorithm is widely used to solve the static optimal transport problem, a recent work (Pooladian and Niles-Weed, 2024) proposed the Sinkhorn bridge, which estimates Schrödinger bridges by plugging optimal transport into the time-dependent drifts of SDEs, with statistical guarantees in the one-sample estimation setting where the true source distribution is fully accessible. In this work, to further justify this method, we study the statistical performance of intermediate Sinkhorn iterations in the two-sample estimation setting, where only finite samples from both source and target distributions are available. Specifically, we establish a statistical bound on the squared total variation error of Sinkhorn bridge iterations: $O(1/m+1/n + r^{4k})~(r \in (0,1))$, where $m$ and $n$ are the sample sizes from the source and target distributions, respectively, and $k$ is the number of Sinkhorn iterations. This result provides a theoretical guarantee for the finite-sample performance of the Schrödinger bridge estimator and offers practical guidance for selecting sample sizes and the number of Sinkhorn iterations. Notably, our theoretical results apply to several representative methods such as [SF]$^2$M, DSBM-IMF, BM2, and LightSB(-M) under specific settings, through the previously unnoticed connection between these estimators.