LG CG GNApr 15, 2024

Wasserstein Wormhole: Scalable Optimal Transport Distance with Transformers

Doron Haviv, Russell Zhang Kunes, Thomas Dougherty, Cassandra Burdziak, Tal Nawy, Anna Gilbert, Dana Pe'er

arXiv:2404.09411v415.016 citationsh-index: 30ICML

Originality Highly original

AI Analysis

This work addresses scalability issues in optimal transport for researchers in computational geometry and single-cell biology, offering a novel method to make OT approaches more practical for large-scale data analysis.

The paper tackled the intractability of computing pairwise Wasserstein distances for large cohorts by developing Wasserstein Wormhole, a transformer-based autoencoder that embeds distributions into a latent space where Euclidean distances approximate OT distances, enabling linear time computation and closely matching Wasserstein distances empirically.

Optimal transport (OT) and the related Wasserstein metric (W) are powerful and ubiquitous tools for comparing distributions. However, computing pairwise Wasserstein distances rapidly becomes intractable as cohort size grows. An attractive alternative would be to find an embedding space in which pairwise Euclidean distances map to OT distances, akin to standard multidimensional scaling (MDS). We present Wasserstein Wormhole, a transformer-based autoencoder that embeds empirical distributions into a latent space wherein Euclidean distances approximate OT distances. Extending MDS theory, we show that our objective function implies a bound on the error incurred when embedding non-Euclidean distances. Empirically, distances between Wormhole embeddings closely match Wasserstein distances, enabling linear time computation of OT distances. Along with an encoder that maps distributions to embeddings, Wasserstein Wormhole includes a decoder that maps embeddings back to distributions, allowing for operations in the embedding space to generalize to OT spaces, such as Wasserstein barycenter estimation and OT interpolation. By lending scalability and interpretability to OT approaches, Wasserstein Wormhole unlocks new avenues for data analysis in the fields of computational geometry and single-cell biology.

View on arXiv PDF

Similar