LGFeb 9, 2023
The Monge Gap: A Regularizer to Learn All Transport MapsThéo Uscidda, Marco Cuturi · apple-ml
Optimal transport (OT) theory has been been used in machine learning to study and characterize maps that can push-forward efficiently a probability measure onto another. Recent works have drawn inspiration from Brenier's theorem, which states that when the ground cost is the squared-Euclidean distance, the ``best'' map to morph a continuous measure in $\mathcal{P}(\Rd)$ into another must be the gradient of a convex function. To exploit that result, [Makkuva+ 2020, Korotin+2020] consider maps $T=\nabla f_θ$, where $f_θ$ is an input convex neural network (ICNN), as defined by Amos+2017, and fit $θ$ with SGD using samples. Despite their mathematical elegance, fitting OT maps with ICNNs raises many challenges, due notably to the many constraints imposed on $θ$; the need to approximate the conjugate of $f_θ$; or the limitation that they only work for the squared-Euclidean cost. More generally, we question the relevance of using Brenier's result, which only applies to densities, to constrain the architecture of candidate maps fitted on samples. Motivated by these limitations, we propose a radically different approach to estimating OT maps: Given a cost $c$ and a reference measure $ρ$, we introduce a regularizer, the Monge gap $\mathcal{M}^c_ρ(T)$ of a map $T$. That gap quantifies how far a map $T$ deviates from the ideal properties we expect from a $c$-OT map. In practice, we drop all architecture requirements for $T$ and simply minimize a distance (e.g., the Sinkhorn divergence) between $T\sharpμ$ and $ν$, regularized by $\mathcal{M}^c_ρ(T)$. We study $\mathcal{M}^c_ρ$, and show how our simple pipeline outperforms significantly other baselines in practice.
CVNov 25, 2023
Unbalancedness in Neural Monge Maps Improves Unpaired Domain TranslationLuca Eyring, Dominik Klein, Théo Uscidda et al.
In optimal transport (OT), a Monge map is known as a mapping that transports a source distribution to a target distribution in the most cost-efficient way. Recently, multiple neural estimators for Monge maps have been developed and applied in diverse unpaired domain translation tasks, e.g. in single-cell biology and computer vision. However, the classic OT framework enforces mass conservation, which makes it prone to outliers and limits its applicability in real-world scenarios. The latter can be particularly harmful in OT domain translation tasks, where the relative position of a sample within a distribution is explicitly taken into account. While unbalanced OT tackles this challenge in the discrete setting, its integration into neural Monge map estimators has received limited attention. We propose a theoretically grounded method to incorporate unbalancedness into any Monge map estimator. We improve existing estimators to model cell trajectories over time and to predict cellular responses to perturbations. Moreover, our approach seamlessly integrates with the OT flow matching (OT-FM) framework. While we show that OT-FM performs competitively in image translation, we further improve performance by incorporating unbalancedness (UOT-FM), which better preserves relevant features. We hence establish UOT-FM as a principled method for unpaired image translation.
MLOct 13, 2023
GENOT: Entropic (Gromov) Wasserstein Flow Matching with Applications to Single-Cell GenomicsDominik Klein, Théo Uscidda, Fabian Theis et al.
Single-cell genomics has significantly advanced our understanding of cellular behavior, catalyzing innovations in treatments and precision medicine. However, single-cell sequencing technologies are inherently destructive and can only measure a limited array of data modalities simultaneously. This limitation underscores the need for new methods capable of realigning cells. Optimal transport (OT) has emerged as a potent solution, but traditional discrete solvers are hampered by scalability, privacy, and out-of-sample estimation issues. These challenges have spurred the development of neural network-based solvers, known as neural OT solvers, that parameterize OT maps. Yet, these models often lack the flexibility needed for broader life science applications. To address these deficiencies, our approach learns stochastic maps (i.e. transport plans), allows for any cost function, relaxes mass conservation constraints and integrates quadratic solvers to tackle the complex challenges posed by the (Fused) Gromov-Wasserstein problem. Utilizing flow matching as a backbone, our method offers a flexible and effective framework. We demonstrate its versatility and robustness through applications in cell development studies, cellular drug response modeling, and cross-modality cell translation, illustrating significant potential for enhancing therapeutic strategies.
LGJul 10, 2024
Disentangled Representation Learning with the Gromov-Monge GapThéo Uscidda, Luca Eyring, Karsten Roth et al.
Learning disentangled representations from unlabelled data is a fundamental challenge in machine learning. Solving it may unlock other problems, such as generalization, interpretability, or fairness. Although remarkably challenging to solve in theory, disentanglement is often achieved in practice through prior matching. Furthermore, recent works have shown that prior matching approaches can be enhanced by leveraging geometrical considerations, e.g., by learning representations that preserve geometric features of the data, such as distances or angles between points. However, matching the prior while preserving geometric features is challenging, as a mapping that fully preserves these features while aligning the data distribution with the prior does not exist in general. To address these challenges, we introduce a novel approach to disentangled representation learning based on quadratic optimal transport. We formulate the problem using Gromov-Monge maps that transport one distribution onto another with minimal distortion of predefined geometric features, preserving them as much as can be achieved. To compute such maps, we propose the Gromov-Monge-Gap (GMG), a regularizer quantifying whether a map moves a reference distribution with minimal geometry distortion. We demonstrate the effectiveness of our approach for disentanglement across four standard benchmarks, outperforming other methods leveraging geometric considerations.
LGOct 15, 2024Code
GeOT: A spatially explicit framework for evaluating spatio-temporal predictionsNina Wiedemann, Théo Uscidda, Martin Raubal
When predicting observations across space and time, the spatial layout of errors impacts a model's real-world utility. For instance, in bike sharing demand prediction, error patterns translate to relocation costs. However, commonly used error metrics in GeoAI evaluate predictions point-wise, neglecting effects such as spatial heterogeneity, autocorrelation, and the Modifiable Areal Unit Problem. We put forward Optimal Transport (OT) as a spatial evaluation metric and loss function. The proposed framework, called GeOT, assesses the performance of prediction models by quantifying the transport costs associated with their prediction errors. Through experiments on real and synthetic data, we demonstrate that 1) the spatial distribution of prediction errors relates to real-world costs in many applications, 2) OT captures these spatial costs more accurately than existing metrics, and 3) OT enhances comparability across spatial and temporal scales. Finally, we advocate for leveraging OT as a loss function in neural networks to improve the spatial accuracy of predictions. Experiments with bike sharing, charging station, and traffic datasets show that spatial costs are significantly reduced with only marginal changes to non-spatial error metrics. Thus, this approach not only offers a spatially explicit tool for model evaluation and selection, but also integrates spatial considerations into model training. All code is available at https://github.com/mie-lab/geospatialOT.
OCJun 13, 2024
Mirror and Preconditioned Gradient Descent in Wasserstein SpaceClément Bonet, Théo Uscidda, Adam David et al.
As the problem of minimizing functionals on the Wasserstein space encompasses many applications in machine learning, different optimization algorithms on $\mathbb{R}^d$ have received their counterpart analog on the Wasserstein space. We focus here on lifting two explicit algorithms: mirror descent and preconditioned gradient descent. These algorithms have been introduced to better capture the geometry of the function to minimize and are provably convergent under appropriate (namely relative) smoothness and convexity conditions. Adapting these notions to the Wasserstein space, we prove guarantees of convergence of some Wasserstein-gradient-based discrete-time schemes for new pairings of objective functionals and regularizers. The difficulty here is to carefully select along which curves the functionals should be smooth and convex. We illustrate the advantages of adapting the geometry induced by the regularizer on ill-conditioned optimization tasks, and showcase the improvement of choosing different discrepancies and geometries in a computational biology task of aligning single-cells.