LG, CV, IT | Jan 14, 2017

On Hölder projective divergences

arXiv:1701.03916v1 | 27 citations
Originality: Incremental advance
AI Analysis

The paper provides new mathematical tools for measuring distances between distributions in machine learning, particularly unnormalized ones, though it appears incremental because it extends existing divergence concepts.

The authors introduced Hölder divergences and pseudo-divergences as new statistical distances based on Hölder inequalities, which generalize the Cauchy-Schwarz divergence and work with unnormalized distributions. They derived closed-form expressions for exponential families and demonstrated in clustering experiments that symmetrized Hölder divergences outperform the symmetric Cauchy-Schwarz divergence on Gaussian distributions.

We describe a framework to build distances by measuring the tightness of inequalities, and introduce the notion of proper statistical divergences and improper pseudo-divergences. We then consider the Hölder ordinary and reverse inequalities, and present two novel classes of Hölder divergences and pseudo-divergences that both encapsulate the special case of the Cauchy-Schwarz divergence. We report closed-form formulas for those statistical dissimilarities when considering distributions belonging to the same exponential family provided that the natural parameter space is a cone (e.g., multivariate Gaussians), or affine (e.g., categorical distributions). Those new classes of Hölder distances are invariant to rescaling, and thus do not require distributions to be normalized. Finally, we show how to compute statistical Hölder centroids with respect to those divergences, and carry out center-based clustering toy experiments on a set of Gaussian distributions that demonstrate empirically that symmetrized Hölder divergences outperform the symmetric Cauchy-Schwarz divergence.
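To make the "tightness of an inequality" idea concrete, here is a minimal sketch, not the authors' implementation. It assumes discrete positive vectors (sums in place of integrals) and a log-ratio form built directly from Hölder's inequality, sum(p*q) <= ||p||_alpha * ||q||_beta with 1/alpha + 1/beta = 1. The function name `holder_pseudo_divergence` and the sample vectors are hypothetical, chosen only for illustration.

```python
import numpy as np

def holder_pseudo_divergence(p, q, alpha=2.0):
    """Log-ratio tightness of Hölder's inequality for positive vectors.

    Assumption: discrete setting, with sums standing in for integrals.
    alpha > 1 and beta are conjugate exponents: 1/alpha + 1/beta = 1.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    beta = alpha / (alpha - 1.0)                    # conjugate exponent
    inner = np.sum(p * q)                           # left side of the inequality
    norm_p = np.sum(p ** alpha) ** (1.0 / alpha)    # ||p||_alpha
    norm_q = np.sum(q ** beta) ** (1.0 / beta)      # ||q||_beta
    # Hölder guarantees inner <= norm_p * norm_q, so the result is >= 0.
    return -np.log(inner / (norm_p * norm_q))

p = np.array([0.2, 0.5, 0.3])
q = np.array([0.1, 0.6, 0.3])

d = holder_pseudo_divergence(p, q, alpha=3.0)
# Rescaling either argument leaves the value unchanged.
d_scaled = holder_pseudo_divergence(10.0 * p, 0.5 * q, alpha=3.0)
# alpha = 2 gives the Cauchy-Schwarz case, which vanishes when q is proportional to p.
cs_self = holder_pseudo_divergence(p, 7.0 * p, alpha=2.0)
```

In this sketch the scale factors cancel between the numerator and the two norms, which is why normalization is not needed. For alpha other than 2, equality in Hölder's inequality holds when p^alpha is proportional to q^beta rather than when p equals q, which is consistent with the abstract's distinction between pseudo-divergences and proper divergences.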

