LG Feb 24, 2022

Clarifying MCMC-based training of modern EBMs: Contrastive Divergence versus Maximum Likelihood

arXiv:2202.12176v11.8

Originality Synthesis-oriented

AI Analysis

This work resolves theoretical misunderstandings in EBM training for researchers, but it is incremental as it clarifies existing methods rather than introducing new ones.

The paper addresses confusion around modern Energy-Based Models (EBMs) by identifying theoretical errors in recent influential papers concerning Contrastive Divergence (CD) versus Maximum Likelihood training, and it offers a reinterpretation of their methods supported by illustrative experiments.

The Energy-Based Model (EBM) framework is a very general approach to generative modeling that tries to learn and exploit probability distributions defined only through unnormalized scores. It has risen in popularity recently thanks to the impressive results obtained in image generation by parameterizing the distribution with Convolutional Neural Networks (CNNs). However, the motivation and theoretical foundations behind modern EBMs are often absent from recent papers, and this sometimes results in confusion. In particular, the theoretical justifications behind the popular MCMC-based learning algorithm Contrastive Divergence (CD) are often glossed over, and we find that this leads to theoretical errors in recent influential papers (Du & Mordatch, 2019; Du et al., 2020). After offering a first-principles introduction to MCMC-based training, we argue that the learning algorithm they use cannot in fact be described as CD, and we reinterpret their methods in light of a new interpretation. Finally, we discuss the implications of our new interpretation and provide some illustrative experiments.
