From Boltzmann Machines to Neural Networks and Back Again
This work addresses the challenge of learning latent variable models, offering machine learning researchers new theoretical results and algorithmic improvements; it builds incrementally on existing connections between RBMs and neural networks.
The paper tackles the difficulty of learning graphical models with latent variables, specifically Restricted Boltzmann Machines (RBMs), by connecting the problem to learning two-layer neural networks; for both problems it gives results that are nearly optimal under the conjectured hardness of sparse parity with noise. It also initiates the theoretical study of supervised RBMs and gives an algorithm for learning a natural class of them, with better runtime than is possible for the related class of networks without distributional assumptions.
Graphical models are powerful tools for modeling high-dimensional data, but learning graphical models in the presence of latent variables is well known to be difficult. In this work we give new results for learning Restricted Boltzmann Machines (RBMs), probably the most well-studied class of latent variable models. Our results are based on new connections to learning two-layer neural networks under $\ell_{\infty}$-bounded input; for both problems, we give nearly optimal results under the conjectured hardness of sparse parity with noise. Using the connection between RBMs and feedforward networks, we also initiate the theoretical study of supervised RBMs [Hinton, 2012], a version of neural-network learning that couples distributional assumptions induced from the underlying graphical model with the architecture of the unknown function class. We then give an algorithm for learning a natural class of supervised RBMs with better runtime than what is possible for its related class of networks without distributional assumptions.
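To make the connection between RBMs and feedforward networks concrete, the following is a minimal sketch of one standard identity. It is an illustration under stated assumptions, not the paper's algorithm. It assumes $\pm 1$-valued visible units $x$ and hidden units $h$ with joint law $P(x, h) \propto \exp(x^\top W h + b^\top x + c^\top h)$; this parameterization and all function names below are assumptions for illustration, and the paper's own conventions may differ. Summing out $h$ gives $\log P(x) = b^\top x + \sum_j \log(2\cosh(w_j \cdot x + c_j)) + \text{const}$, so the conditional law of one visible unit given the rest is a sigmoid applied to a two-layer network in the remaining coordinates.

```python
import itertools
import math


def log2cosh(z):
    """Numerically stable log(exp(z) + exp(-z))."""
    return abs(z) + math.log1p(math.exp(-2.0 * abs(z)))


def log_marginal_closed_form(x, W, b, c):
    """Unnormalized log P(x) with hidden units summed out analytically.

    Assumed convention: W[i][j] couples visible unit i to hidden unit j.
    """
    linear = sum(bi * xi for bi, xi in zip(b, x))
    # Pre-activation of hidden unit j: w_j . x + c_j
    pre = [sum(W[i][j] * x[i] for i in range(len(x))) + c[j] for j in range(len(c))]
    return linear + sum(log2cosh(z) for z in pre)


def log_marginal_brute_force(x, W, b, c):
    """Same quantity, computed by enumerating all 2^m hidden configurations."""
    total = 0.0
    for h in itertools.product([-1.0, 1.0], repeat=len(c)):
        energy = sum(x[i] * W[i][j] * h[j] for i in range(len(x)) for j in range(len(c)))
        energy += sum(bi * xi for bi, xi in zip(b, x))
        energy += sum(cj * hj for cj, hj in zip(c, h))
        total += math.exp(energy)
    return math.log(total)


def conditional_prob_plus(i, x, W, b, c):
    """P(x_i = +1 | other visibles): a sigmoid of a difference of two-layer network outputs."""
    x_plus, x_minus = list(x), list(x)
    x_plus[i], x_minus[i] = 1.0, -1.0
    logit = log_marginal_closed_form(x_plus, W, b, c) - log_marginal_closed_form(x_minus, W, b, c)
    return 1.0 / (1.0 + math.exp(-logit))
```

The brute-force function exists only to check the closed form on tiny models, since it enumerates every hidden configuration. The closed form costs time linear in the number of weights, which is the sense in which predicting a visible unit looks like evaluating a small feedforward network.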