LG, CV, ML | Dec 6, 2019

Your Classifier is Secretly an Energy Based Model and You Should Treat it Like One

arXiv:1912.03263v3 | 653 citations
Originality: Highly original
AI Analysis

This work addresses the problem of unifying generative and discriminative learning, offering a single hybrid model whose performance rivals the state of the art in both tasks.

The paper reinterprets standard discriminative classifiers as energy-based models for joint distributions, enabling training on unlabeled data and improving calibration, robustness, and out-of-distribution detection, with sample generation quality rivaling GANs.

We propose to reinterpret a standard discriminative classifier of p(y|x) as an energy based model for the joint distribution p(x,y). In this setting, the standard class probabilities can be easily computed as well as unnormalized values of p(x) and p(x|y). Within this framework, standard discriminative architectures may be used and the model can also be trained on unlabeled data. We demonstrate that energy based training of the joint distribution improves calibration, robustness, and out-of-distribution detection while also enabling our models to generate samples rivaling the quality of recent GAN approaches. We improve upon recently proposed techniques for scaling up the training of energy based models and present an approach which adds little overhead compared to standard classification training. Our approach is the first to achieve performance rivaling the state-of-the-art in both generative and discriminative learning within one hybrid model.
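
The reinterpretation described in the abstract can be illustrated with a small sketch. It assumes the usual reading in which a classifier's logit for class y is treated as the unnormalized log of p(x,y), so that the log-sum-exp over the logits gives an unnormalized log p(x) and the softmax recovers p(y|x). The function names and the example logits below are hypothetical and chosen only for illustration; this is not the authors' code.

```python
import math

def logsumexp(values):
    # Numerically stable log(sum(exp(v))) over a list of floats.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def unnormalized_log_joint(logits, y):
    # Assumption: log p(x, y) = logits[y] - log Z, with Z unknown.
    return logits[y]

def unnormalized_log_marginal(logits):
    # Summing the joint over classes: log p(x) = logsumexp(logits) - log Z.
    return logsumexp(logits)

def class_probabilities(logits):
    # p(y | x) = p(x, y) / p(x); the unknown Z cancels, leaving the softmax.
    log_px = unnormalized_log_marginal(logits)
    return [math.exp(unnormalized_log_joint(logits, y) - log_px)
            for y in range(len(logits))]

# Hypothetical logits for a single input x and three classes.
logits = [2.0, 0.5, -1.0]
probs = class_probabilities(logits)
log_px = unnormalized_log_marginal(logits)

# Adding a constant to every logit leaves p(y|x) unchanged but shifts the
# unnormalized log p(x): information a plain classifier ignores.
shifted = [v + 3.0 for v in logits]
shifted_probs = class_probabilities(shifted)
shifted_log_px = unnormalized_log_marginal(shifted)
```

In this sketch the class probabilities are identical for `logits` and `shifted`, while the unnormalized log-density differs by exactly the added constant. That extra degree of freedom is what the energy based view puts to use.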

Code Implementations: 5 repos
