Advanced Mean Field Theory of Restricted Boltzmann Machine
This work addresses a computational bottleneck for researchers and practitioners who use restricted Boltzmann machines in machine learning, although the contribution appears incremental because it builds on existing mean field approaches.
The authors address the difficulty of computing log-likelihood gradients when training restricted Boltzmann machines. They develop an advanced mean field theory based on the Bethe approximation, which yields an efficient message-passing method that evaluates the partition function and its gradients without statistical sampling. The results are compared with those of computationally expensive sampling-based methods.
Learning in a restricted Boltzmann machine is typically hard because it requires computing gradients of the log-likelihood function. To describe the network state statistics of the restricted Boltzmann machine, we develop an advanced mean field theory based on the Bethe approximation. Our theory provides an efficient message-passing-based method that evaluates not only the partition function (free energy) but also its gradients, without requiring statistical sampling. The results are compared with those obtained by the computationally expensive sampling-based method.
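To make the quantities in the abstract concrete, the following minimal sketch computes the exact log partition function and its weight gradient for a tiny restricted Boltzmann machine by brute-force enumeration. This is not the message-passing method of the paper; it is only a reference calculation of the kind that an approximate or sampling-based estimate could be checked against, and it scales exponentially with the number of visible units. The ±1 unit convention, the energy parameterization, and the names `log_partition` and `weight_gradient` are assumptions made for illustration.

```python
import itertools
import math


def _hidden_fields(w, b, s):
    """Input to each hidden unit given a visible configuration s."""
    return [b[j] + sum(w[i][j] * s[i] for i in range(len(s)))
            for j in range(len(b))]


def _log_weight(w, a, b, s):
    """Log of the unnormalized marginal weight of visible configuration s.

    Hidden units are summed analytically: sum over h = +/-1 of exp(h * x)
    equals 2 * cosh(x).
    """
    fields = _hidden_fields(w, b, s)
    return (sum(a[i] * s[i] for i in range(len(s)))
            + sum(math.log(2.0 * math.cosh(f)) for f in fields))


def log_partition(w, a, b):
    """Exact ln Z by enumerating all 2**n_vis visible configurations.

    Assumed convention: units take values +/-1, w[i][j] couples visible i
    to hidden j, a holds visible biases, b holds hidden biases.
    """
    terms = [_log_weight(w, a, b, s)
             for s in itertools.product((-1, 1), repeat=len(a))]
    m = max(terms)  # log-sum-exp for numerical stability
    return m + math.log(sum(math.exp(t - m) for t in terms))


def weight_gradient(w, a, b):
    """Exact d ln Z / d w[i][j], i.e. the model correlation <s_i h_j>."""
    log_z = log_partition(w, a, b)
    grad = [[0.0] * len(b) for _ in range(len(a))]
    for s in itertools.product((-1, 1), repeat=len(a)):
        p = math.exp(_log_weight(w, a, b, s) - log_z)  # P(s)
        fields = _hidden_fields(w, b, s)
        for i in range(len(a)):
            for j in range(len(b)):
                # <h_j | s> = tanh(field_j) for +/-1 hidden units
                grad[i][j] += p * s[i] * math.tanh(fields[j])
    return grad


# Illustrative parameters: 3 visible units, 2 hidden units.
w = [[0.3, -0.2], [0.1, 0.4], [-0.5, 0.2]]
a = [0.1, -0.1, 0.0]
b = [0.2, -0.3]
log_z = log_partition(w, a, b)
grad = weight_gradient(w, a, b)
```

The model correlation returned by `weight_gradient` is the term in the log-likelihood gradient that sampling-based training estimates with Markov chains; the enumeration above is feasible only for very small models, which is why an efficient approximation is of interest.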