Belief Propagation in Conditional RBMs for Structured Prediction
This work improves structured prediction with conditional RBMs (CRBMs) by showing that belief propagation (BP) can be both efficient and effective as the inference routine. The contribution is incremental, since it builds on existing methods.
The authors tackled the problem of structured prediction using conditional Restricted Boltzmann Machines (CRBMs) by implementing a scalable matrix-based belief propagation (BP) algorithm, demonstrating that it significantly outperforms state-of-the-art contrastive divergence (CD) methods in both maximum likelihood and max-margin learning.
Restricted Boltzmann machines (RBMs) and conditional RBMs (CRBMs) are popular models for a wide range of applications. In previous work, learning on such models has been dominated by contrastive divergence (CD) and its variants. Belief propagation (BP) algorithms are believed to be slow for structured prediction on conditional RBMs (e.g., Mnih et al. [2011]), and not as good as CD when applied in learning (e.g., Larochelle et al. [2012]). In this work, we present a matrix-based implementation of belief propagation algorithms on CRBMs, which is easily scalable to tens of thousands of visible and hidden units. We demonstrate that, in both maximum likelihood and max-margin learning, training conditional RBMs with BP as the inference routine can provide significantly better results than current state-of-the-art CD methods on structured prediction problems. We also include practical guidelines on training CRBMs with BP, and some insights on the interaction of learning and inference algorithms for CRBMs.
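To make the idea of matrix-based BP concrete, the following is a minimal sketch of sum-product BP on a binary RBM in which every message is stored as one entry of a dense matrix, so that a full round of updates is a few array operations. It is an illustration under stated assumptions, not the authors' implementation: units are assumed binary, the energy is assumed to be E(v, h) = -v^T W h - a^T v - b^T h, the conditioning input of a CRBM is assumed to be already folded into the biases a and b, and the function name rbm_bp_marginals and the damping parameter are hypothetical choices made for this example.

```python
import numpy as np

def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return np.logaddexp(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rbm_bp_marginals(W, a, b, n_iters=50, damping=0.5):
    """Approximate marginals of a binary RBM with sum-product BP.

    Hypothetical helper for illustration. W has shape (n_vis, n_hid);
    a and b are visible and hidden biases (assumed to already include
    any dependence on the conditioning input).
    Messages are kept as log-odds: M_vh[i, j] is the message from
    visible unit i to hidden unit j, M_hv[i, j] the message from
    hidden unit j to visible unit i.
    """
    n_vis, n_hid = W.shape
    M_vh = np.zeros((n_vis, n_hid))
    M_hv = np.zeros((n_vis, n_hid))
    for _ in range(n_iters):
        # Cavity field of visible i with the message from hidden j removed.
        cav_v = (a + M_hv.sum(axis=1))[:, None] - M_hv
        new_vh = softplus(W + cav_v) - softplus(cav_v)
        M_vh = damping * M_vh + (1.0 - damping) * new_vh

        # Cavity field of hidden j with the message from visible i removed.
        cav_h = (b + M_vh.sum(axis=0))[None, :] - M_vh
        new_hv = softplus(W + cav_h) - softplus(cav_h)
        M_hv = damping * M_hv + (1.0 - damping) * new_hv

    # Beliefs: bias plus all incoming log-odds messages.
    p_v = sigmoid(a + M_hv.sum(axis=1))
    p_h = sigmoid(b + M_vh.sum(axis=0))
    return p_v, p_h

# Example call on a small random model (values are arbitrary).
rng = np.random.default_rng(0)
W_demo = 0.1 * rng.standard_normal((6, 4))
a_demo = rng.standard_normal(6)
b_demo = rng.standard_normal(4)
p_v_demo, p_h_demo = rbm_bp_marginals(W_demo, a_demo, b_demo)
```

Each round costs a constant number of elementwise operations on arrays of the same shape as W, which is what makes this style of update easy to vectorize. The returned beliefs are exact only when the bipartite graph has no cycles (for example, a single hidden unit); on a general RBM they are approximations.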