LG ML Jan 9, 2021

How to Train Your Energy-Based Models

arXiv:2101.03288v2
332 citations
Originality: Synthesis-oriented
AI Analysis

This tutorial addresses the challenge of training Energy-Based Models and is aimed at researchers and practitioners who want to apply or study these flexible probabilistic models.

This paper provides an introduction to modern training approaches for Energy-Based Models (EBMs), which are probabilistic models with an unknown normalizing constant that makes training difficult. It explains maximum likelihood training with Markov chain Monte Carlo (MCMC) and MCMC-free approaches like Score Matching (SM) and Noise Contrastive Estimation (NCE), highlighting their theoretical connections.

Energy-Based Models (EBMs), also known as non-normalized probabilistic models, specify probability density or mass functions up to an unknown normalizing constant. Unlike most other probabilistic models, EBMs do not place a restriction on the tractability of the normalizing constant, thus are more flexible to parameterize and can model a more expressive family of probability distributions. However, the unknown normalizing constant of EBMs makes training particularly difficult. Our goal is to provide a friendly introduction to modern approaches for EBM training. We start by explaining maximum likelihood training with Markov chain Monte Carlo (MCMC), and proceed to elaborate on MCMC-free approaches, including Score Matching (SM) and Noise Contrastive Estimation (NCE). We highlight theoretical connections among these three approaches, and end with a brief survey on alternative training methods, which are still under active research. Our tutorial is targeted at an audience with basic understanding of generative models who want to apply EBMs or start a research project in this direction.
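
For readers who want the "unknown normalizing constant" made concrete, the standard way to write an EBM can be sketched as follows. The symbols $E_\theta$ (energy function), $Z_\theta$ (normalizing constant), and $x'$ (a sample from the model) are notation assumed here for illustration and do not appear on this page.

```latex
% Density defined through an energy function, up to the constant Z_theta
p_\theta(x) = \frac{\exp\bigl(-E_\theta(x)\bigr)}{Z_\theta},
\qquad
Z_\theta = \int \exp\bigl(-E_\theta(x)\bigr)\,dx

% Log-likelihood gradient: the second term is an expectation under the model,
% which is why maximum likelihood training typically needs MCMC samples
\nabla_\theta \log p_\theta(x)
  = -\nabla_\theta E_\theta(x)
  + \mathbb{E}_{x' \sim p_\theta}\bigl[\nabla_\theta E_\theta(x')\bigr]

% Score with respect to x: Z_theta drops out, which Score Matching exploits
\nabla_x \log p_\theta(x) = -\nabla_x E_\theta(x)
```

Under these assumptions, the difficulty described in the abstract comes from the integral defining $Z_\theta$, which is generally intractable, while the last line shows why a method based on the score can avoid computing it.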
