Training Restricted Boltzmann Machine by Perturbation
This work addresses the challenge of learning models that lack efficient MAP estimation, offering a novel approach for training RBMs with potential applications in machine learning domains requiring fast and regularized feature extraction.
The paper tackles maximum likelihood learning for discrete graphical models, specifically Restricted Boltzmann Machines (RBMs). It introduces a Perturb and Descend (PD) method that starts from training data to produce samples efficiently. For RBMs this requires only linear calculations and thresholding, and the perturbation yields robust features and sparse hidden layer activation.
A new approach to maximum likelihood learning of discrete graphical models, and RBMs in particular, is introduced. Our method, Perturb and Descend (PD), is inspired by two ideas: (I) the perturb and MAP method for sampling, and (II) learning by Contrastive Divergence minimization. In contrast to perturb and MAP, PD leverages training data to learn models that do not allow efficient MAP estimation. During learning, to produce a sample from the current model, we start from a training data point and descend in the energy landscape of the "perturbed model" for a fixed number of steps, or until a local optimum is reached. For an RBM, this involves linear calculations and thresholding, which can be very fast. Furthermore, we show that the amount of perturbation is closely related to the temperature parameter, and that it can regularize the model by producing robust features, resulting in sparse hidden layer activation.
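The sampling step described above can be illustrated with a minimal sketch for a binary RBM. It rests on several assumptions that the abstract does not specify: the energy is taken as E(v, h) = -a·v - b·h - v·W·h, only the biases are perturbed, the noise is logistic (the difference of two Gumbel variables, as commonly used in perturb and MAP), and a hypothetical `scale` parameter controls the amount of perturbation. The names `energy` and `perturb_and_descend` are illustrative, not taken from the paper.

```python
import numpy as np


def energy(v, h, W, a, b):
    """Energy of a binary RBM state (assumed form): -a.v - b.h - v.W.h."""
    return -(v @ a) - (h @ b) - (v @ W @ h)


def perturb_and_descend(v0, W, a, b, rng, scale=1.0, max_steps=10):
    """Sketch of one PD sample, started at the training vector v0.

    Assumption: only the biases are perturbed, with logistic noise
    multiplied by `scale` (a hypothetical knob for the perturbation size).
    """
    # Draw one perturbation and keep it fixed for the whole descent.
    a_tilde = a + scale * rng.logistic(size=a.shape)
    b_tilde = b + scale * rng.logistic(size=b.shape)

    v = np.asarray(v0, dtype=float).copy()
    # With v fixed, the perturbed energy is minimized over h by thresholding
    # a linear function of v.
    h = (v @ W + b_tilde > 0).astype(float)

    for _ in range(max_steps):
        # Block coordinate descent: exact minimization over v, then over h.
        v_new = (W @ h + a_tilde > 0).astype(float)
        h_new = (v_new @ W + b_tilde > 0).astype(float)
        if np.array_equal(v_new, v) and np.array_equal(h_new, h):
            break  # local optimum of the perturbed energy reached
        v, h = v_new, h_new

    # The perturbed biases are returned so the caller can evaluate
    # the perturbed energy of the result.
    return v, h, a_tilde, b_tilde
```

Each update is an exact minimization over one layer with the other held fixed, so the perturbed energy never increases along the trajectory. In a Contrastive Divergence style update, the returned `(v, h)` would play the role of the model sample in the negative phase; how the paper forms that update is not stated in the abstract, so it is left out of this sketch.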