Do Neural Optimal Transport Solvers Work? A Continuous Wasserstein-2 Benchmark
This work fills an evaluation gap for machine learning researchers who use optimal transport: it provides a benchmark for neural solvers and exposes limitations in existing ones.
The authors address the lack of a standard benchmark for neural network-based optimal transport solvers by constructing continuous measures whose ground truth transport maps are known analytically. Their evaluation shows that many solvers fail to recover optimal transport maps accurately, even though they perform well in downstream tasks.
Despite the recent popularity of neural network-based solvers for optimal transport (OT), there is no standard quantitative way to evaluate their performance. In this paper, we address this issue for quadratic-cost transport -- specifically, computation of the Wasserstein-2 distance, a commonly used formulation of optimal transport in machine learning. To overcome the challenge of computing ground truth transport maps between continuous measures needed to assess these solvers, we use input-convex neural networks (ICNN) to construct pairs of measures whose ground truth OT maps can be obtained analytically. This strategy yields pairs of continuous benchmark measures in high-dimensional spaces such as spaces of images. We thoroughly evaluate existing optimal transport solvers using these benchmark measures. Even though these solvers perform well in downstream tasks, many do not faithfully recover optimal transport maps. To investigate the cause of this discrepancy, we further test the solvers in an image generation setting. Our study reveals crucial limitations of existing solvers and shows that increased OT accuracy does not necessarily correlate with better results downstream.
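The construction idea can be illustrated with a minimal sketch. By Brenier's theorem, if P is absolutely continuous and psi is convex, then the gradient of psi is the optimal quadratic-cost map from P to its pushforward, so the ground truth is known by construction. The sketch below is not the authors' code: it replaces the ICNN with a small hand-written convex potential (a quadratic plus a log-sum-exp of affine functions), and the names A_COEF, W, B, grad_psi, sample_pair, and map_error are hypothetical, chosen for illustration only.

```python
import math
import random

# Assumption: a toy convex potential standing in for an ICNN,
#   psi(x) = 0.5 * A_COEF * ||x||^2 + log(sum_k exp(<w_k, x> + b_k)).
# The quadratic term is strictly convex and log-sum-exp of affine
# functions is convex, so psi is convex.
A_COEF = 0.5
W = [[1.0, -0.5], [-0.3, 0.8], [0.2, 0.2]]
B = [0.0, 0.1, -0.2]


def grad_psi(x):
    """Gradient of psi at x: the ground truth map from P to its pushforward."""
    logits = [sum(wi * xi for wi, xi in zip(w, x)) + b for w, b in zip(W, B)]
    m = max(logits)  # subtract the max for a numerically stable softmax
    exps = [math.exp(v - m) for v in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    return [
        A_COEF * x[j] + sum(p * w[j] for p, w in zip(probs, W))
        for j in range(len(x))
    ]


def sample_pair(n, seed=0):
    """Draw x ~ P (standard Gaussian) and return (x, grad_psi(x)) samples."""
    rng = random.Random(seed)
    xs = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(n)]
    ys = [grad_psi(x) for x in xs]
    return xs, ys


def map_error(candidate, xs):
    """Mean squared distance between a candidate map and the ground truth map."""
    total = 0.0
    for x in xs:
        t, g = candidate(x), grad_psi(x)
        total += sum((ti - gi) ** 2 for ti, gi in zip(t, g))
    return total / len(xs)


if __name__ == "__main__":
    xs, ys = sample_pair(1000)
    print("error of ground truth map:", map_error(grad_psi, xs))
    print("error of identity map:", map_error(lambda x: x, xs))
```

A solver under test would be trained on samples from the two measures only, and its learned map would then be passed as `candidate`. The plain mean squared error used here is an illustrative choice and not necessarily the metric reported in the paper.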