ML | LG | ME | Apr 19, 2025

Learning over von Mises-Fisher Distributions via a Wasserstein-like Geometry

arXiv:2504.14164v1 | 2 citations | h-index: 7 | Stat comput
Originality: Incremental advance
AI Analysis

This work provides a principled tool for directional data analysis, addressing a known bottleneck in probabilistic learning for spherical data, though it is incremental in extending optimal transport concepts to a specific distribution family.

The authors address the problem of comparing von Mises-Fisher distributions for directional data by introducing a Wasserstein-like distance that decomposes the discrepancy into angular and concentration components. The distance has a tractable closed-form expression, and the authors demonstrate it in applications such as mixture reduction on synthetic and real-world datasets.

We introduce a novel, geometry-aware distance metric for the family of von Mises-Fisher (vMF) distributions, which are fundamental models for directional data on the unit hypersphere. Although the vMF distribution is widely employed in a variety of probabilistic learning tasks involving spherical data, principled tools for comparing vMF distributions remain limited, primarily due to the intractability of normalization constants and the absence of suitable geometric metrics. Motivated by the theory of optimal transport, we propose a Wasserstein-like distance that decomposes the discrepancy between two vMF distributions into two interpretable components: a geodesic term capturing the angular separation between mean directions, and a variance-like term quantifying differences in concentration parameters. The derivation leverages a Gaussian approximation in the high-concentration regime to yield a tractable, closed-form expression that respects the intrinsic spherical geometry. We show that the proposed distance exhibits desirable theoretical properties and induces a latent geometric structure on the space of non-degenerate vMF distributions. As a primary application, we develop efficient algorithms for vMF mixture reduction, enabling structure-preserving compression of mixture models in high-dimensional settings. Empirical results on synthetic datasets and real-world high-dimensional embeddings, including biomedical sentence representations and deep visual features, demonstrate the effectiveness of the proposed geometry in distinguishing distributions and supporting interpretable inference. This work expands the statistical toolbox for directional data analysis by introducing a tractable, transport-inspired distance tailored to the geometry of the hypersphere.
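The two-component structure described in the abstract can be illustrated with a small sketch. The abstract does not state the closed-form expression, so the formula below is an assumption rather than the authors' definition: it treats a highly concentrated vMF(mu, kappa) on the unit sphere in p dimensions as roughly Gaussian in the tangent space with covariance (1/kappa) times the identity, and borrows the 2-Wasserstein formula for isotropic Gaussians, replacing the Euclidean mean gap with the geodesic angle. The function name `vmf_wasserstein_like` and its signature are hypothetical.

```python
import math


def vmf_wasserstein_like(mu1, kappa1, mu2, kappa2):
    """Illustrative distance between vMF(mu1, kappa1) and vMF(mu2, kappa2).

    ASSUMPTION (not taken from the paper): in the high-concentration regime
    each vMF is approximated by an isotropic Gaussian on the (p - 1)-dimensional
    tangent space with standard deviation 1 / sqrt(kappa). The result combines
      * a geodesic term: the squared angle between the mean directions, and
      * a variance-like term: (p - 1) * (1/sqrt(kappa1) - 1/sqrt(kappa2))**2.
    """
    if len(mu1) != len(mu2) or len(mu1) < 2:
        raise ValueError("mean directions must share a dimension p >= 2")
    if kappa1 <= 0 or kappa2 <= 0:
        raise ValueError("concentrations must be positive (non-degenerate vMF)")

    # Normalize defensively so the inputs need not be exact unit vectors.
    n1 = math.sqrt(sum(x * x for x in mu1))
    n2 = math.sqrt(sum(x * x for x in mu2))
    if n1 == 0.0 or n2 == 0.0:
        raise ValueError("mean directions must be nonzero")
    cos_angle = sum(a * b for a, b in zip(mu1, mu2)) / (n1 * n2)

    # Clamp to [-1, 1] to guard acos against floating-point overshoot.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    geodesic = math.acos(cos_angle)

    p = len(mu1)
    spread = (p - 1) * (1.0 / math.sqrt(kappa1) - 1.0 / math.sqrt(kappa2)) ** 2

    return math.sqrt(geodesic ** 2 + spread)


if __name__ == "__main__":
    # Orthogonal mean directions with equal concentration: only the angular term remains.
    print(vmf_wasserstein_like([1.0, 0.0, 0.0], 10.0, [0.0, 1.0, 0.0], 10.0))
    # Same mean direction with different concentration: only the variance-like term remains.
    print(vmf_wasserstein_like([1.0, 0.0, 0.0], 4.0, [1.0, 0.0, 0.0], 16.0))
```

Under this assumed form the two terms can be inspected separately, which mirrors the interpretability claim in the abstract: the first call isolates the angular separation, the second isolates the difference in concentration. The expression actually derived in the paper may differ in its constants or in how the concentration enters.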
