Differentiable Generalized Sliced Wasserstein Plans
This work addresses the scalability of OT for machine learning practitioners working with large or high-dimensional datasets, although the contribution is incremental in that it builds on existing slicing methods.
The paper tackles the computational bottleneck of Optimal Transport (OT) in high dimensions by reformulating min-SWGG, a slicing method, as a bilevel optimization problem and proposing a differentiable approximation scheme that efficiently identifies the optimal slice. It also defines a generalized extension of min-SWGG for data living on manifolds. The approach is demonstrated on gradient flows, both on manifolds and in high-dimensional spaces, and on a sliced OT-based conditional flow matching method for image generation.
Optimal Transport (OT) has attracted significant interest in the machine learning community, not only for its ability to define meaningful distances between probability distributions -- such as the Wasserstein distance -- but also for its formulation of OT plans. Its computational complexity remains a bottleneck, though, and slicing techniques have been developed to scale OT to large datasets. Recently, a novel slicing scheme, dubbed min-SWGG, lifts a single one-dimensional plan back to the original multidimensional space, finally selecting the slice that yields the lowest Wasserstein distance as an approximation of the full OT plan. Despite its computational and theoretical advantages, min-SWGG inherits typical limitations of slicing methods: (i) the number of required slices grows exponentially with the data dimension, and (ii) it is constrained to linear projections. Here, we reformulate min-SWGG as a bilevel optimization problem and propose a differentiable approximation scheme to efficiently identify the optimal slice, even in high-dimensional settings. We furthermore define its generalized extension to accommodate data living on manifolds. Finally, we demonstrate the practical value of our approach in various applications, including gradient flows on manifolds and high-dimensional spaces, as well as a novel sliced OT-based conditional flow matching for image generation -- where fast computation of transport plans is essential.
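To make the slicing idea concrete, here is a minimal sketch of the baseline min-SWGG procedure that the abstract describes, using plain random search over directions. It is not the differentiable scheme proposed in the paper. It assumes two point clouds with the same number of points, uniform weights, and a squared Euclidean cost; the function names `swgg_cost` and `min_swgg_random_search` and the parameter `n_slices` are hypothetical and chosen for illustration only.

```python
import numpy as np


def swgg_cost(X, Y, theta):
    """Cost of the 1D plan along theta, lifted back to the original space.

    Assumes X and Y are (n, d) arrays with uniform weights.
    """
    # Sorting the projections gives the optimal 1D matching along theta.
    ix = np.argsort(X @ theta)
    iy = np.argsort(Y @ theta)
    # Evaluate that matching with the squared Euclidean cost in R^d.
    return np.mean(np.sum((X[ix] - Y[iy]) ** 2, axis=1))


def min_swgg_random_search(X, Y, n_slices=200, seed=0):
    """Return the lowest lifted cost over random unit directions, and that direction."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Normalised Gaussian vectors are uniformly distributed on the unit sphere.
    thetas = rng.normal(size=(n_slices, d))
    thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)
    costs = np.array([swgg_cost(X, Y, t) for t in thetas])
    best = int(np.argmin(costs))
    return costs[best], thetas[best]
```

Because each slice yields a valid matching between the two point clouds, every lifted cost is an upper bound on the exact squared 2-Wasserstein distance, which is why taking the minimum over slices is sensible. The random search above is also where limitation (i) appears: the number of directions needed to find a good slice grows quickly with the dimension, which is the motivation for optimizing the direction instead of sampling it.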