LGNov 2, 2022Code
Entropic Neural Optimal Transport via Diffusion ProcessesNikita Gushchin, Alexander Kolesov, Alexander Korotin et al.
We propose a novel neural algorithm for the fundamental problem of computing the entropic optimal transport (EOT) plan between continuous probability distributions which are accessible by samples. Our algorithm is based on the saddle point reformulation of the dynamic version of EOT which is known as the Schrödinger Bridge problem. In contrast to the prior methods for large-scale EOT, our algorithm is end-to-end and consists of a single learning step, has fast inference procedure, and allows handling small values of the entropy regularization coefficient which is of particular importance in some applied problems. Empirically, we show the performance of the method on several large-scale EOT tasks. https://github.com/ngushchin/EntropicNeuralOptimalTransport
LGApr 12, 2023Code
Energy-guided Entropic Neural Optimal TransportPetr Mokrov, Alexander Korotin, Alexander Kolesov et al.
Energy-based models (EBMs) are known in the Machine Learning community for decades. Since the seminal works devoted to EBMs dating back to the noughties, there have been a lot of efficient methods which solve the generative modelling problem by means of energy potentials (unnormalized likelihood functions). In contrast, the realm of Optimal Transport (OT) and, in particular, neural OT solvers is much less explored and limited by few recent works (excluding WGAN-based approaches which utilize OT as a loss function and do not model OT maps themselves). In our work, we bridge the gap between EBMs and Entropy-regularized OT. We present a novel methodology which allows utilizing the recent developments and technical improvements of the former in order to enrich the latter. From the theoretical perspective, we prove generalization bounds for our technique. In practice, we validate its applicability in toy 2D and image domains. To showcase the scalability, we empower our method with a pre-trained StyleGAN and apply it to high-res AFHQ $512\times 512$ unpaired I2I translation. For simplicity, we choose simple short- and long-run EBMs as a backbone of our Energy-guided Entropic OT approach, leaving the application of more sophisticated EBMs for future research. Our code is available at: https://github.com/PetrMokrov/Energy-guided-Entropic-OT
LGJun 16, 2023Code
Building the Bridge of Schrödinger: A Continuous Entropic Optimal Transport BenchmarkNikita Gushchin, Alexander Kolesov, Petr Mokrov et al.
Over the last several years, there has been significant progress in developing neural solvers for the Schrödinger Bridge (SB) problem and applying them to generative modelling. This new research field is justifiably fruitful as it is interconnected with the practically well-performing diffusion models and theoretically grounded entropic optimal transport (EOT). Still, the area lacks non-trivial tests allowing a researcher to understand how well the methods solve SB or its equivalent continuous EOT problem. We fill this gap and propose a novel way to create pairs of probability distributions for which the ground truth OT solution is known by the construction. Our methodology is generic and works for a wide range of OT formulations, in particular, it covers the EOT which is equivalent to SB (the main interest of our study). This development allows us to create continuous benchmark distributions with the known EOT and SB solutions on high-dimensional spaces such as spaces of images. As an illustration, we use these benchmark pairs to test how well existing neural EOT/SB solvers actually compute the EOT solution. Our code for constructing benchmark pairs under different setups is available at: https://github.com/ngushchin/EntropicOTBenchmark.
LGOct 2, 2023Code
Energy-Guided Continuous Entropic Barycenter Estimation for General CostsAlexander Kolesov, Petr Mokrov, Igor Udovichenko et al.
Optimal transport (OT) barycenters are a mathematically grounded way of averaging probability distributions while capturing their geometric properties. In short, the barycenter task is to take the average of a collection of probability distributions w.r.t. given OT discrepancies. We propose a novel algorithm for approximating the continuous Entropic OT (EOT) barycenter for arbitrary OT cost functions. Our approach is built upon the dual reformulation of the EOT problem based on weak OT, which has recently gained the attention of the ML community. Beyond its novelty, our method enjoys several advantageous properties: (i) we establish quality bounds for the recovered solution; (ii) this approach seamlessly interconnects with the Energy-Based Models (EBMs) learning procedure enabling the use of well-tuned algorithms for the problem of interest; (iii) it provides an intuitive optimization scheme avoiding min-max, reinforce and other intricate technical tricks. For validation, we consider several low-dimensional scenarios and image-space setups, including non-Euclidean cost functions. Furthermore, we investigate the practical task of learning the barycenter on an image manifold generated by a pretrained generative model, opening up new directions for real-world applications. Our code is available at https://github.com/justkolesov/EnergyGuidedBarycenters.
LGJun 15, 2022
Kantorovich Strikes Back! Wasserstein GANs are not Optimal Transport?Alexander Korotin, Alexander Kolesov, Evgeny Burnaev
Wasserstein Generative Adversarial Networks (WGANs) are the popular generative models built on the theory of Optimal Transport (OT) and the Kantorovich duality. Despite the success of WGANs, it is still unclear how well the underlying OT dual solvers approximate the OT cost (Wasserstein-1 distance, $\mathbb{W}_{1}$) and the OT gradient needed to update the generator. In this paper, we address these questions. We construct 1-Lipschitz functions and use them to build ray monotone transport plans. This strategy yields pairs of continuous benchmark distributions with the analytically known OT plan, OT cost and OT gradient in high-dimensional spaces such as spaces of images. We thoroughly evaluate popular WGAN dual form solvers (gradient penalty, spectral normalization, entropic regularization, etc.) using these benchmark pairs. Even though these solvers perform well in WGANs, none of them faithfully compute $\mathbb{W}_{1}$ in high dimensions. Nevertheless, many provide a meaningful approximation of the OT gradient. These observations suggest that these solvers should not be treated as good estimators of $\mathbb{W}_{1}$, but to some extent they indeed can be used in variational problems requiring the minimization of $\mathbb{W}_{1}$.
LGFeb 6, 2024Code
Estimating Barycenters of Distributions with Neural Optimal TransportAlexander Kolesov, Petr Mokrov, Igor Udovichenko et al.
Given a collection of probability measures, a practitioner sometimes needs to find an "average" distribution which adequately aggregates reference distributions. A theoretically appealing notion of such an average is the Wasserstein barycenter, which is the primal focus of our work. By building upon the dual formulation of Optimal Transport (OT), we propose a new scalable approach for solving the Wasserstein barycenter problem. Our methodology is based on the recent Neural OT solver: it has bi-level adversarial learning objective and works for general cost functions. These are key advantages of our method since the typical adversarial algorithms leveraging barycenter tasks utilize tri-level optimization and focus mostly on quadratic cost. We also establish theoretical error bounds for our proposed approach and showcase its applicability and effectiveness in illustrative scenarios and image data setups. Our source code is available at https://github.com/justkolesov/NOTBarycenters.
LGFeb 4, 2025Code
Field Matching: an Electrostatic Paradigm to Generate and Transfer DataAlexander Kolesov, Manukhov Stepan, Vladimir V. Palyulin et al.
We propose Electrostatic Field Matching (EFM), a novel method that is suitable for both generative modeling and distribution transfer tasks. Our approach is inspired by the physics of an electrical capacitor. We place source and target distributions on the capacitor plates and assign them positive and negative charges, respectively. Then we learn the electrostatic field of the capacitor using a neural network approximator. To map the distributions to each other, we start at one plate of the capacitor and move the samples along the learned electrostatic field lines until they reach the other plate. We theoretically justify that this approach provably yields the distribution transfer. In practice, we demonstrate the performance of our EFM in toy and image data experiments. Our code is available at https://github.com/justkolesov/FieldMatching
LGJun 3, 2025
Interaction Field Matching: Overcoming Limitations of Electrostatic ModelsStepan I. Manukhov, Alexander Kolesov, Vladimir V. Palyulin et al.
Electrostatic field matching (EFM) has recently appeared as a novel physics-inspired paradigm for data generation and transfer using the idea of an electric capacitor. However, it requires modeling electrostatic fields using neural networks, which is non-trivial because of the necessity to take into account the complex field outside the capacitor plates. In this paper, we propose Interaction Field Matching (IFM), a generalization of EFM which allows using general interaction fields beyond the electrostatic one. Furthermore, inspired by strong interactions between quarks and antiquarks in physics, we design a particular interaction field realization which solves the problems which arise when modeling electrostatic fields in EFM. We show the performance on a series of toy and image data transfer problems.
LGSep 28, 2025
Electric Currents for Discrete Data GenerationAlexander Kolesov, Stepan Manukhov, Vladimir V. Palyulin et al.
We propose $\textbf{E}$lectric $\textbf{C}$urrent $\textbf{D}$iscrete $\textbf{D}$ata $\textbf{G}$eneration (ECD$^{2}$G), a pioneering method for data generation in discrete settings that is grounded in electrical engineering theory. Our approach draws an analogy between electric current flow in a circuit and the transfer of probability mass between data distributions. We interpret samples from the source distribution as current input nodes of a circuit and samples from the target distribution as current output nodes. A neural network is then used to learn the electric currents to represent the probability flow in the circuit. To map the source distribution to the target, we sample from the source and transport these samples along the circuit pathways according to the learned currents. This process provably guarantees transfer between data distributions. We present proof-of-concept experiments to illustrate our ECD$^{2}$G method.