Stochastic Discriminative EM
This work combines discriminative training with probabilistic generative models, offering a principled way to handle missing data and latent variables. The contribution appears incremental, since it builds on existing EM and gradient descent methods.
The paper introduces stochastic discriminative EM (sdEM), an online-EM-type algorithm for discriminative training of probabilistic generative models. sdEM minimizes discriminative loss functions such as the negative conditional log-likelihood and the hinge loss. Because the trained models remain generative, they can handle missing data and latent variables. Performance is demonstrated on text classification tasks using multinomial naive Bayes and latent Dirichlet allocation classifiers.
Stochastic discriminative EM (sdEM) is an online-EM-type algorithm for discriminative training of probabilistic generative models belonging to the exponential family. In this work, we introduce and justify this algorithm as a stochastic natural gradient descent method, i.e. a method which accounts for the information geometry of the parameter space of the statistical model. We show how this learning algorithm can be used to train probabilistic generative models by minimizing different discriminative loss functions, such as the negative conditional log-likelihood and the hinge loss. The models trained by sdEM are always generative (i.e. they define a joint probability distribution) and, in consequence, allow missing data and latent variables to be handled in a principled way, both during learning and when making predictions. The performance of this method is illustrated on several text classification problems, for which multinomial naive Bayes and latent Dirichlet allocation based classifiers are learned using different discriminative loss functions.
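To make the idea of a stochastic natural gradient step on a generative classifier more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a multinomial naive Bayes model parameterized by expected sufficient statistics (class indicators and per-class word counts) and takes one step per example on the negative conditional log-likelihood. It relies on the standard exponential-family fact that the natural gradient with respect to the expectation parameters equals the ordinary gradient with respect to the natural parameters, which for this loss is the expected sufficient statistics under p(y | x) minus the observed ones. The function names, the toy data, the step-size schedule, and the positivity floor are all illustrative assumptions.

```python
import numpy as np

def posterior(mu_class, mu_word, x):
    """Return p(y | x) for a multinomial naive Bayes model.

    mu_class: (C,) positive expectation parameters for the class indicator.
    mu_word:  (C, V) positive expectation parameters for per-class word counts.
    x:        (V,) word-count vector of one document.
    """
    log_prior = np.log(mu_class / mu_class.sum())
    log_cond = np.log(mu_word / mu_word.sum(axis=1, keepdims=True))
    log_joint = log_prior + log_cond @ x
    log_joint -= log_joint.max()  # stabilise the softmax
    p = np.exp(log_joint)
    return p / p.sum()

def discriminative_step(mu_class, mu_word, x, y, rho, floor=1e-3):
    """One stochastic step on the negative conditional log-likelihood.

    The update adds rho * (s(y, x) - E[s(Y, x) | x]) to the expectation
    parameters. The floor that keeps parameters positive is an assumption
    of this sketch, not something taken from the source.
    """
    p = posterior(mu_class, mu_word, x)
    diff = -p
    diff[y] += 1.0  # one-hot(y) - p(. | x)
    new_class = np.maximum(mu_class + rho * diff, floor)
    new_word = np.maximum(mu_word + rho * np.outer(diff, x), floor)
    return new_class, new_word

# Hypothetical toy stream: 2 classes, 4 words, 10 words per document.
rng = np.random.default_rng(0)
word_probs = np.array([[0.4, 0.4, 0.1, 0.1],
                       [0.1, 0.1, 0.4, 0.4]])
mu_class = np.ones(2)
mu_word = np.ones((2, 4))
for t in range(200):
    y = int(rng.integers(2))
    x = rng.multinomial(10, word_probs[y]).astype(float)
    # Decreasing step size; the schedule is an arbitrary choice.
    mu_class, mu_word = discriminative_step(
        mu_class, mu_word, x, y, rho=1.0 / (t + 10)
    )

# Class posterior for a document dominated by the first two words.
p_test = posterior(mu_class, mu_word, np.array([5.0, 5.0, 0.0, 0.0]))
print(p_test)
```

Because the parameters remain those of a joint model over labels and words, the same `mu_class` and `mu_word` still define a generative distribution after training. This is the property the abstract credits for principled handling of missing data and latent variables.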