Structured Matching via Cost-Regularized Unbalanced Optimal Transport
The work addresses a practical challenge in computational biology, the matching of heterogeneous datasets, but the contribution is incremental because it builds on existing unbalanced optimal transport methods.
The paper tackles the problem of matching datasets in heterogeneous spaces by introducing cost-regularized unbalanced optimal transport (CR-UOT), which allows the ground cost to vary and improves alignment of single-cell omics profiles, particularly when many cells lack direct matches.
Unbalanced optimal transport (UOT) provides a flexible way to match or compare nonnegative finite Radon measures. However, UOT requires a predefined ground transport cost, which may misrepresent the data's underlying geometry. Choosing such a cost is particularly challenging when datasets live in heterogeneous spaces, often motivating practitioners to adopt Gromov-Wasserstein formulations. To address this challenge, we introduce cost-regularized unbalanced optimal transport (CR-UOT), a framework that lets the ground cost vary while permitting mass creation and removal. We show that CR-UOT incorporates unbalanced Gromov-Wasserstein-type problems through families of inner-product costs parameterized by linear transformations, enabling the matching of measures or point clouds across Euclidean spaces. We develop algorithms for such CR-UOT problems using entropic regularization and demonstrate that this approach improves the alignment of heterogeneous single-cell omics profiles, especially when many cells lack direct matches.
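The abstract does not spell out the algorithm, so the following is only a minimal sketch of how an entropic CR-UOT solver of this kind might look, not the authors' method. It assumes an inner-product cost c_A(x, y) = -<A x, y> with a linear map A, a squared Frobenius penalty (lam / 2) * ||A||_F^2 on A, and KL penalties on both marginals; the function names, the penalty choice, and all parameter values are hypothetical. It alternates the standard scaling iterations for entropic UOT with a closed-form update of A for the current plan.

```python
import numpy as np

def uot_sinkhorn(C, a, b, eps=0.5, rho=1.0, n_iter=200):
    """Entropic UOT with KL marginal penalties, solved by scaling iterations."""
    K = np.exp(-C / eps)              # Gibbs kernel of the current cost
    u, v = np.ones_like(a), np.ones_like(b)
    tau = rho / (rho + eps)           # exponent < 1 relaxes the marginal constraints
    for _ in range(n_iter):
        u = (a / (K @ v)) ** tau
        v = (b / (K.T @ u)) ** tau
    return u[:, None] * K * v[None, :]   # plan P = diag(u) K diag(v)

def inner_product_cost(X, Y, A):
    """C[i, j] = -<A x_i, y_j> for X of shape (n, dx), Y of shape (m, dy), A of shape (dy, dx)."""
    return -(X @ A.T) @ Y.T

def cr_uot_sketch(X, Y, a, b, lam=2.0, eps=0.5, rho=1.0, n_outer=20):
    """Alternate between the plan P and the linear map A (assumed Frobenius penalty)."""
    dx, dy = X.shape[1], Y.shape[1]
    A = np.eye(dy, dx)                # hypothetical initialization: rectangular identity
    for _ in range(n_outer):
        P = uot_sinkhorn(inner_product_cost(X, Y, A), a, b, eps, rho)
        # For fixed P, minimizing -<A, Y^T P^T X> + (lam / 2) ||A||_F^2 gives:
        A = Y.T @ P.T @ X / lam
    return P, A

# Toy data in spaces of different dimension, rescaled so every point has norm <= 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
X /= np.linalg.norm(X, axis=1).max()
Y = rng.normal(size=(5, 2))
Y /= np.linalg.norm(Y, axis=1).max()
a, b = np.full(6, 1 / 6), np.full(5, 1 / 5)

P, A = cr_uot_sketch(X, Y, a, b)
print(P.shape, A.shape, P.sum())
```

Because the marginal constraints are relaxed, the total mass `P.sum()` need not equal 1, which is what allows cells without a counterpart to be left partly unmatched. The data are rescaled and `lam` is kept moderately large in this toy run so that the kernel `np.exp(-C / eps)` stays in a safe numerical range; a log-domain implementation would be the more robust choice for real data.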