ML, LG, Mar 8, 2019

Random Matrix-Improved Estimation of the Wasserstein Distance between two Centered Gaussian Distributions

arXiv:1903.03447v1, 4 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of accurate covariance estimation in high-dimensional statistics, offering a method that improves upon state-of-the-art alternatives, though it is incremental in nature.

The paper tackles the problem of estimating functionals of the eigenvalues of the product of two covariance matrices, in particular the Wasserstein distance between two centered Gaussian distributions, and shows that the proposed estimator largely outperforms the classical sample covariance-based plug-in estimator.

This article proposes a method to consistently estimate functionals $\frac1p\sum_{i=1}^pf(λ_i(C_1C_2))$ of the eigenvalues of the product of two covariance matrices $C_1,C_2\in\mathbb{R}^{p\times p}$ based on the empirical estimates $λ_i(\hat C_1\hat C_2)$ ($\hat C_a=\frac1{n_a}\sum_{i=1}^{n_a} x_i^{(a)}x_i^{(a){\sf T}}$), when the size $p$ and number $n_a$ of the (zero mean) samples $x_i^{(a)}$ are similar. As a corollary, a consistent estimate of the Wasserstein distance (related to the case $f(t)=\sqrt{t}$) between centered Gaussian distributions is derived. The new estimate is shown to largely outperform the classical sample covariance-based "plug-in" estimator. Based on this finding, a practical application to covariance estimation is then devised which demonstrates potentially significant performance gains with respect to state-of-the-art alternatives.
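To make the baseline concrete, here is a minimal Python sketch of the classical "plug-in" estimator that the abstract compares against. It is not the paper's random-matrix-improved estimator. It relies on the standard closed form for centered Gaussians, $W_2^2 = \mathrm{tr}\,C_1 + \mathrm{tr}\,C_2 - 2\sum_{i=1}^p\sqrt{λ_i(C_1C_2)}$, which is where the case $f(t)=\sqrt{t}$ comes from. The function names `sample_covariance` and `plugin_wasserstein_sq` are hypothetical, and the sketch returns the unnormalized squared distance (no $\frac1p$ factor), which is an assumption about convention rather than something fixed by the abstract.

```python
import numpy as np


def sample_covariance(X):
    """Sample covariance (1/n) X^T X for zero-mean samples stored as rows of X (shape n x p)."""
    n = X.shape[0]
    return X.T @ X / n


def plugin_wasserstein_sq(C1, C2):
    """Squared 2-Wasserstein distance between N(0, C1) and N(0, C2).

    Called with sample covariances, this is the plug-in estimate; called with
    the true covariances, it is the exact squared distance.
    """
    # For positive semidefinite C1 and C2, the eigenvalues of C1 @ C2 are real
    # and nonnegative; drop numerical imaginary parts and clip tiny negatives.
    eigenvalues = np.linalg.eigvals(C1 @ C2)
    eigenvalues = np.clip(eigenvalues.real, 0.0, None)
    return np.trace(C1) + np.trace(C2) - 2.0 * np.sum(np.sqrt(eigenvalues))
```

In use, one would call `plugin_wasserstein_sq(sample_covariance(X1), sample_covariance(X2))` with `X1` of shape `(n_1, p)` and `X2` of shape `(n_2, p)`. The abstract's point is that when $p$ is comparable to $n_1$ and $n_2$, this plug-in value is not a consistent estimate, which is what motivates the corrected estimator.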

Code Implementations: 1 repo
