ML | LG | Feb 13, 2020

Learning the Stein Discrepancy for Training and Evaluating Energy-Based Models without Sampling

arXiv:2002.05616v4 | 51 citations
AI Analysis

The paper provides a scalable method for researchers and practitioners working with energy-based models, though the contribution is incremental in that it builds on existing Stein discrepancy concepts.

The paper tackles the problem of evaluating and training unnormalized density models without sampling by estimating the Stein discrepancy using gradients, resulting in a goodness-of-fit test that outperforms existing methods on high-dimensional data and a scalable training approach.

We present a new method for evaluating and training unnormalized density models. Our approach only requires access to the gradient of the unnormalized model's log-density. We estimate the Stein discrepancy between the data density $p(x)$ and the model density $q(x)$ defined by a vector function of the data. We parameterize this function with a neural network and fit its parameters to maximize the discrepancy. This yields a novel goodness-of-fit test which outperforms existing methods on high-dimensional data. Furthermore, optimizing $q(x)$ to minimize this discrepancy produces a novel method for training unnormalized models which scales more gracefully than existing methods. The ability to both learn and compare models is a unique feature of the proposed method.
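To make the abstract's central quantity concrete, here is a minimal sketch of a Monte Carlo Stein discrepancy estimate, $\mathbb{E}_{p}[\nabla_x \log q(x)^\top f(x) + \mathrm{tr}(\nabla_x f(x))]$, which needs only the model's score $\nabla_x \log q(x)$ and samples from $p(x)$. Everything specific here is an assumption made for illustration and is not taken from the paper's code. The model $q$ is a standard Gaussian, and the critic $f(x) = \tanh(x)$ is fixed by hand. The paper instead parameterizes the critic with a neural network and fits it to maximize the discrepancy. The function names are hypothetical.

```python
import math
import random


def model_score(x):
    """Score of the assumed model q = N(0, I): grad_x log q(x) = -x."""
    return [-xi for xi in x]


def critic(x):
    """Hypothetical fixed critic f(x) = tanh(x), applied elementwise."""
    return [math.tanh(xi) for xi in x]


def critic_jacobian_trace(x):
    """Trace of the Jacobian of f: sum_i (1 - tanh(x_i)^2)."""
    return sum(1.0 - math.tanh(xi) ** 2 for xi in x)


def stein_estimate(samples):
    """Monte Carlo estimate of E_p[ s_q(x)^T f(x) + tr(grad_x f(x)) ]."""
    total = 0.0
    for x in samples:
        s, f = model_score(x), critic(x)
        total += sum(si * fi for si, fi in zip(s, f)) + critic_jacobian_trace(x)
    return total / len(samples)


def sample_gaussian(rng, mean, dim, n):
    """Draw n samples from N(mean * 1, I) in `dim` dimensions."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
# Data drawn from the model itself: the estimate should be close to zero.
matched = stein_estimate(sample_gaussian(rng, 0.0, 2, 20000))
# Data drawn from a shifted Gaussian: the estimate moves away from zero.
shifted = stein_estimate(sample_gaussian(rng, 2.0, 2, 20000))
print(f"matched: {matched:.3f}, shifted: {shifted:.3f}")
```

When the samples come from the model itself, the expectation vanishes by Stein's identity, so the estimate sits near zero. A mismatch between $p$ and $q$ pushes it away from zero. Because the critic is fixed rather than optimized, its magnitude only lower-bounds what a learned critic from the same function class could reach.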

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
