Relative Wasserstein Angle and the Problem of the $W_2$-Nearest Gaussian Distribution
This work addresses the challenge of measuring deviations from Gaussianity for researchers in statistics and machine learning, offering a novel geometric approach with practical applications in distribution approximation.
The paper tackles the problem of quantifying non-Gaussianity in empirical distributions using optimal transport, introducing the relative Wasserstein angle and orthogonal projection distance, and shows that the proposed nearest Gaussian improves Fréchet Inception Distance scores over moment matching.
We study the problem of quantifying how far an empirical distribution deviates from Gaussianity under the framework of optimal transport. By exploiting the cone geometry of the relative translation invariant quadratic Wasserstein space, we introduce two novel geometric quantities, the relative Wasserstein angle and the orthogonal projection distance, which provide meaningful measures of non-Gaussianity. We prove that the filling cone generated by any two rays in this space is flat, ensuring that angles, projections, and inner products are rigorously well-defined. This geometric viewpoint recasts Gaussian approximation as a projection problem onto the Gaussian cone and reveals that the commonly used moment-matching Gaussian can \emph{not} be the \(W_2\)-nearest Gaussian for a given empirical distribution. In one dimension, we derive closed-form expressions for the proposed quantities and extend them to several classical distribution families, including uniform, Laplace, and logistic distributions; while in high dimensions, we develop an efficient stochastic manifold optimization algorithm based on a semi-discrete dual formulation. Experiments on synthetic data and real-world feature distributions demonstrate that the relative Wasserstein angle is more robust than the Wasserstein distance and that the proposed nearest Gaussian provides a better approximation than moment matching in the evaluation of Fréchet Inception Distance (FID) scores.