Self-regularizing restricted Boltzmann machines
This work addresses a specific bottleneck in energy-based models, the choice of hidden-layer size, and offers machine learning researchers an incremental improvement through a novel ensemble-based approach.
The authors tackle the problem of determining the optimal number of hidden units in restricted Boltzmann machines by proposing a grand-canonical extension that self-regulates this number, and they report exceedingly small generalization error in learning tasks on data from the Ising theory and MNIST.
Focusing on the grand-canonical extension of the ordinary restricted Boltzmann machine, we suggest an energy-based model for feature extraction that uses a layer of hidden units with varying size. By an appropriate choice of the chemical potential, and given a sufficiently large number of hidden resources, the generative model is able to efficiently deduce the optimal number of hidden units required to learn the target data with exceedingly small generalization error. The formal simplicity of the grand-canonical ensemble, combined with a rapidly converging ansatz in mean-field theory, enables us to recycle well-established numerical algorithms, such as contrastive divergence, during training with only minor changes. As a proof of principle and to demonstrate the novel features of grand-canonical Boltzmann machines, we train our generative models on data from the Ising theory and MNIST.
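
For orientation, the following is a minimal sketch of one contrastive-divergence (CD-1) update for an ordinary binary restricted Boltzmann machine, the baseline algorithm that the abstract says is reused with only minor changes. It is not the authors' grand-canonical method: the chemical potential, the varying hidden-layer size, and the mean-field ansatz are not specified here and are therefore omitted. The function name `cd1_step`, the parameter names, and the toy data are illustrative assumptions.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, applied elementwise."""
    return 1.0 / (1.0 + np.exp(-x))


def cd1_step(v0, W, b, c, lr, rng):
    """One CD-1 update for an ordinary binary RBM (hypothetical helper).

    v0 : (batch, n_visible) array of binary data
    W  : (n_visible, n_hidden) weight matrix
    b  : (n_visible,) visible biases
    c  : (n_hidden,) hidden biases
    """
    # Positive phase: hidden activation probabilities given the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(v0.dtype)

    # Negative phase: a single Gibbs step back to the visible layer.
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(pv1.shape) < pv1).astype(v0.dtype)
    ph1 = sigmoid(v1 @ W + c)

    # Approximate log-likelihood gradient: data term minus model term.
    n = v0.shape[0]
    W_new = W + lr * (v0.T @ ph0 - v1.T @ ph1) / n
    b_new = b + lr * (v0 - v1).mean(axis=0)
    c_new = c + lr * (ph0 - ph1).mean(axis=0)
    return W_new, b_new, c_new


# Toy usage on random binary data (assumed sizes, not from the paper).
rng = np.random.default_rng(0)
data = (rng.random((8, 6)) < 0.5).astype(float)
W = 0.01 * rng.standard_normal((6, 4))
b = np.zeros(6)
c = np.zeros(4)
W, b, c = cd1_step(data, W, b, c, lr=0.1, rng=rng)
```

In this ordinary setting the number of hidden units (4 in the toy usage) is fixed by hand, which is exactly the choice the grand-canonical extension is meant to make automatically.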