LG · ML · Nov 22, 2021

Density Ratio Estimation via Infinitesimal Classification

arXiv:2111.11010v2 · 64 citations
Originality: Highly original
AI Analysis

This paper addresses a fundamental bottleneck in machine learning, comparing two probability distributions from finite samples, and offers a novel approach with practical improvements in high-dimensional settings.

The paper tackles the challenge of density ratio estimation in high-dimensional settings by proposing DRE-∞, which reduces the problem to easier subproblems using an infinite continuum of bridge distributions and a novel time score matching objective. The method shows strong performance on downstream tasks like mutual information estimation and energy-based modeling with complex datasets.

Density ratio estimation (DRE) is a fundamental machine learning technique for comparing two probability distributions. However, existing methods struggle in high-dimensional settings, as it is difficult to accurately compare probability distributions based on finite samples. In this work, we propose DRE-∞, a divide-and-conquer approach to reduce DRE to a series of easier subproblems. Inspired by Monte Carlo methods, we smoothly interpolate between the two distributions via an infinite continuum of intermediate bridge distributions. We then estimate the instantaneous rate of change of the bridge distributions indexed by time (the "time score"), a quantity defined analogously to data (Stein) scores, with a novel time score matching objective. Crucially, the learned time scores can then be integrated to compute the desired density ratio. In addition, we show that traditional (Stein) scores can be used to obtain integration paths that connect regions of high density in both distributions, improving performance in practice. Empirically, we demonstrate that our approach performs well on downstream tasks such as mutual information estimation and energy-based modeling on complex, high-dimensional datasets.
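The claim that time scores can be integrated to recover the density ratio follows from the fundamental theorem of calculus: for a bridge p_t running from p_0 to p_1, log p_1(x) - log p_0(x) equals the integral over t in [0, 1] of the time score ∂/∂t log p_t(x). The sketch below is an illustration only, not the authors' implementation. It assumes two 1-D Gaussian endpoints and a linear-interpolation bridge x_t = (1 - t) x_0 + t x_1, for which the time score has a closed form, so no network is trained and no time score matching is performed. All function names and parameter values are hypothetical, and the direction of integration is a convention chosen here.

```python
import numpy as np

def log_normal_pdf(x, mean, var):
    """Log density of a 1-D Gaussian N(mean, var)."""
    return -0.5 * np.log(2.0 * np.pi * var) - (x - mean) ** 2 / (2.0 * var)

def bridge_params(t, m0, s0, m1, s1):
    """Mean/variance of x_t = (1 - t) x_0 + t x_1 and their time derivatives.

    Assumes independent x_0 ~ N(m0, s0^2) and x_1 ~ N(m1, s1^2).
    """
    mean = (1.0 - t) * m0 + t * m1
    var = (1.0 - t) ** 2 * s0 ** 2 + t ** 2 * s1 ** 2
    dmean = m1 - m0
    dvar = -2.0 * (1.0 - t) * s0 ** 2 + 2.0 * t * s1 ** 2
    return mean, var, dmean, dvar

def time_score(x, t, m0, s0, m1, s1):
    """Closed-form d/dt log p_t(x) for the Gaussian bridge above.

    In DRE-∞ this quantity is learned; here it is analytic (assumption).
    """
    mean, var, dmean, dvar = bridge_params(t, m0, s0, m1, s1)
    diff = x - mean
    return (-0.5 * dvar / var
            + diff * dmean / var
            + 0.5 * diff ** 2 * dvar / var ** 2)

def log_ratio_by_integration(x, m0, s0, m1, s1, n_steps=2001):
    """Trapezoid-rule integral of the time score over t in [0, 1].

    Returns an approximation of log p_1(x) - log p_0(x).
    """
    ts = np.linspace(0.0, 1.0, n_steps)
    scores = np.stack([time_score(x, t, m0, s0, m1, s1) for t in ts])
    dt = np.diff(ts)[:, None]
    return np.sum(0.5 * (scores[1:] + scores[:-1]) * dt, axis=0)

# Hypothetical endpoints: p_0 = N(0, 1) and p_1 = N(1, 0.5^2).
m0, s0, m1, s1 = 0.0, 1.0, 1.0, 0.5
x = np.linspace(-2.0, 3.0, 5)

estimate = log_ratio_by_integration(x, m0, s0, m1, s1)
exact = log_normal_pdf(x, m1, s1 ** 2) - log_normal_pdf(x, m0, s0 ** 2)
max_abs_error = np.max(np.abs(estimate - exact))
print("max abs error:", max_abs_error)
```

With a learned time score in place of the closed form, the same one-dimensional integral over t would be evaluated with a numerical quadrature or ODE solver; the fixed trapezoid grid here is simply the easiest choice to check by hand.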

Code Implementations: 2 repos
