ML LG | Feb 13, 2020

Learning the Stein Discrepancy for Training and Evaluating Energy-Based Models without Sampling

Will Grathwohl, Kuan-Chieh Wang, Jorn-Henrik Jacobsen, David Duvenaud, Richard Zemel

arXiv:2002.05616v428.651 citations | Has Code

Originality: Highly original

AI Analysis

The paper gives researchers and practitioners a scalable method for working with energy-based models, though the contribution is incremental in that it builds on existing Stein discrepancy concepts.

The paper tackles the problem of evaluating and training unnormalized density models without sampling. It estimates the Stein discrepancy using only the gradient of the model's log-density, which yields a goodness-of-fit test that outperforms existing methods on high-dimensional data and a scalable training approach.

We present a new method for evaluating and training unnormalized density models. Our approach only requires access to the gradient of the unnormalized model's log-density. We estimate the Stein discrepancy between the data density $p(x)$ and the model density $q(x)$, where the discrepancy is defined by a vector function of the data. We parameterize this function with a neural network and fit its parameters to maximize the discrepancy. This yields a novel goodness-of-fit test which outperforms existing methods on high-dimensional data. Furthermore, optimizing $q(x)$ to minimize this discrepancy produces a novel method for training unnormalized models which scales more gracefully than existing methods. The ability to both learn and compare models is a unique feature of the proposed method.
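To make the idea concrete, here is a minimal sketch of fitting a critic to maximize a Stein discrepancy estimate. It is not the authors' implementation. It assumes the standard Stein operator, so the quantity estimated is $\mathbb{E}_{p(x)}[\nabla_x \log q(x)^\top f(x) + \mathrm{Tr}(\nabla_x f(x))]$, which is zero for any well-behaved $f$ when $p = q$. As further assumptions, a linear critic $f(x) = Ax + b$ stands in for the neural network, an L2 penalty on $f$ keeps the maximization bounded, and the model is a standard normal with score $-x$. All function names are hypothetical.

```python
import numpy as np


def fit_linear_critic(x, score, lam=0.5, lr=0.05, steps=1000):
    """Gradient ascent on a regularized Stein objective for f(x) = A x + b.

    Assumption: a linear critic replaces the neural network, and the
    penalty lam * E[||f(x)||^2] is one simple way to bound the critic.
    """
    n, d = x.shape
    s = score(x)  # rows are grad_x log q(x_i); no normalizer is needed
    A = np.zeros((d, d))
    b = np.zeros(d)
    for _ in range(steps):
        f = x @ A.T + b
        # Analytic gradients of mean(s . f) + Tr(A) - lam * mean(||f||^2).
        grad_A = (s.T @ x) / n + np.eye(d) - 2.0 * lam * (f.T @ x) / n
        grad_b = s.mean(axis=0) - 2.0 * lam * f.mean(axis=0)
        A += lr * grad_A
        b += lr * grad_b
    return A, b


def stein_estimate(x, score, A, b):
    """Monte Carlo estimate of E_p[score(x)^T f(x) + Tr(grad_x f(x))]."""
    f = x @ A.T + b
    # For a linear critic the Jacobian of f is A, so its trace is Tr(A).
    return float(np.mean(np.sum(score(x) * f, axis=1)) + np.trace(A))


def standard_normal_score(x):
    """Score of the assumed model q = N(0, I): grad_x log q(x) = -x."""
    return -x


rng = np.random.default_rng(0)
x_match = rng.normal(size=(5000, 2))                          # data drawn from q
x_shift = rng.normal(size=(5000, 2)) + np.array([1.0, 0.0])   # data from a shifted p

for name, data in [("matched", x_match), ("shifted", x_shift)]:
    A, b = fit_linear_critic(data, standard_normal_score)
    print(name, stein_estimate(data, standard_normal_score, A, b))
```

In this toy setting the estimate should be close to zero for the matched data and clearly positive for the shifted data. The sketch scores the critic on the same samples used to fit it, which is optimistic; a held-out split would be the more careful choice for a goodness-of-fit test.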
