ML LG · Jan 30

Approximating $f$-Divergences with Rank Statistics

arXiv:2601.22784v1 · 1.7 · h-index: 2

Originality: Incremental advance

AI Analysis

The paper provides a divergence-estimation method that avoids density-ratio estimation, which is useful to machine learning practitioners who need to compare distributions. It appears incremental, however, because it builds on existing rank-based and sliced techniques.

The paper estimates $f$-divergences between distributions from rank statistics, without explicit density-ratio estimation. It proves monotonicity, a lower-bound property, and convergence rates, and it validates the estimator empirically against neural baselines and in generative modeling.

We introduce a rank-statistic approximation of $f$-divergences that avoids explicit density-ratio estimation by working directly with the distribution of ranks. For a resolution parameter $K$, we map the mismatch between two univariate distributions $\mu$ and $\nu$ to a rank histogram on $\{0, \ldots, K\}$ and measure its deviation from uniformity via a discrete $f$-divergence, yielding a rank-statistic divergence estimator. We prove that the resulting estimator is monotone in $K$ and always lower-bounds the true $f$-divergence, and we establish quantitative convergence rates as $K\to\infty$ under mild regularity of the quantile-domain density ratio. To handle high-dimensional data, we define the sliced rank-statistic $f$-divergence by averaging the univariate construction over random projections, and we provide convergence results for the sliced limit as well. We also derive finite-sample deviation bounds along with asymptotic normality results for the estimator. Finally, we empirically validate the approach by benchmarking against neural baselines and illustrating its use as a learning objective in generative modelling experiments.
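The construction described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the rank is taken to be the number of $K$ draws from $\nu$ that fall at or below a single draw from $\mu$, the histogram is compared to the uniform law on $\{0, \ldots, K\}$ in the direction $D_f(Q_K \,\|\, U)$, and the generator defaults to $f(t) = t \log t$ (KL). All function names and parameters (`rank_histogram`, `n_trials`, `n_proj`, and so on) are hypothetical.

```python
import numpy as np


def kl_generator(t):
    """Generator f(t) = t * log(t) of the KL divergence, with f(0) = 0."""
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    pos = t > 0
    out[pos] = t[pos] * np.log(t[pos])
    return out


def rank_histogram(x, y, K, n_trials=2000, rng=None):
    """Empirical law of the rank of one x-draw among K y-draws (assumed construction)."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xs = rng.choice(x, size=n_trials)        # one reference point per trial
    ys = rng.choice(y, size=(n_trials, K))   # K comparison points per trial
    ranks = (ys <= xs[:, None]).sum(axis=1)  # integer ranks in {0, ..., K}
    counts = np.bincount(ranks, minlength=K + 1)
    return counts / n_trials


def discrete_f_divergence(q, f=kl_generator):
    """D_f(q || uniform) = sum_k u_k f(q_k / u_k) with u_k = 1 / len(q)."""
    q = np.asarray(q, dtype=float)
    return float(np.mean(f(q.size * q)))


def rank_f_divergence(x, y, K, f=kl_generator, n_trials=2000, rng=None):
    """Univariate rank-statistic divergence estimate at resolution K."""
    return discrete_f_divergence(rank_histogram(x, y, K, n_trials, rng), f)


def sliced_rank_f_divergence(X, Y, K, n_proj=50, f=kl_generator,
                             n_trials=2000, rng=None):
    """Average the univariate estimate over random unit-norm projections."""
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    thetas = rng.normal(size=(n_proj, X.shape[1]))
    thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)
    vals = [rank_f_divergence(X @ th, Y @ th, K, f, n_trials, rng)
            for th in thetas]
    return float(np.mean(vals))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.normal(0.0, 1.0, size=5000)
    b_same = rng.normal(0.0, 1.0, size=5000)
    b_shift = rng.normal(3.0, 1.0, size=5000)
    print("same law:   ", rank_f_divergence(a, b_same, K=8, rng=1))
    print("shifted law:", rank_f_divergence(a, b_shift, K=8, rng=1))
```

When the two samples come from the same continuous distribution, the rank is uniform on $\{0, \ldots, K\}$, so the estimate should be close to zero up to Monte Carlo noise; a shift between the distributions concentrates the histogram on extreme ranks and increases the value. The resampling with `rng.choice` is a convenience for this sketch and need not match how the paper forms its finite-sample estimator.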

