LGMar 3

Transport Clustering: Solving Low-Rank Optimal Transport via Clustering

arXiv:2603.03578v1h-index: 6
Originality Highly original
AI Analysis

This addresses a computational bottleneck in optimal transport for researchers and practitioners, offering a scalable solution with theoretical guarantees, though it is incremental as it builds on existing low-rank OT methods.

The paper tackles the NP-hard problem of low-rank optimal transport by introducing transport clustering, which reduces it to a clustering problem and provides polynomial-time constant-factor approximation algorithms with proven bounds, outperforming existing solvers on synthetic and large-scale datasets.

Optimal transport (OT) finds a least cost transport plan between two probability distributions using a cost matrix defined on pairs of points. Unlike standard OT, which infers unstructured pointwise mappings, low-rank optimal transport explicitly constrains the rank of the transport plan to infer latent structure. This improves statistical stability and robustness, yields sharper parametric rates for estimating Wasserstein distances adaptive to the intrinsic rank, and generalizes $K$-means to co-clustering. These advantages, however, come at the cost of a non-convex and NP-hard optimization problem. We introduce transport clustering, an algorithm to compute a low-rank OT plan that reduces low-rank OT to a clustering problem on correspondences obtained from a full-rank $\textit{transport registration}$ step. We prove that this reduction yields polynomial-time, constant-factor approximation algorithms for low-rank OT: specifically, a $(1+γ)$ approximation for negative-type metrics and a $(1+γ+\sqrt{2γ}\,)$ approximation for kernel costs, where $γ\in [0,1]$ denotes the approximation ratio of the optimal full-rank solution relative to the low-rank optimal. Empirically, transport clustering outperforms existing low-rank OT solvers on synthetic benchmarks and large-scale, high-dimensional datasets.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes