ML AI LG
Mar 16, 2012

Learning Feature Hierarchies with Centered Deep Boltzmann Machines

arXiv:1203.3783v1
9 citations
Originality: Incremental advance
AI Analysis

This work addresses a key bottleneck in training deep Boltzmann machines, which makes it relevant to researchers in unsupervised learning and hierarchical feature extraction. It is incremental in that it modifies an existing learning algorithm rather than introducing a new paradigm.

The paper tackled the difficulty of training deep Boltzmann machines jointly by proposing a modification that recenters activation outputs to zero, which improved the Hessian conditioning and facilitated learning. The resulting centered deep Boltzmann machine successfully learned hierarchical representations and achieved better generative modeling on real data.

Deep Boltzmann machines are in principle powerful models for extracting the hierarchical structure of data. Unfortunately, attempts to train layers jointly (without greedy layer-wise pretraining) have been largely unsuccessful. We propose a modification of the learning algorithm that initially recenters the output of the activation functions to zero. This modification leads to a better conditioned Hessian and thus makes learning easier. We test the algorithm on real data and demonstrate that our suggestion, the centered deep Boltzmann machine, learns a hierarchy of increasingly abstract representations and a better generative model of data.
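The centering idea can be illustrated with a minimal sketch for a single pair of layers. It assumes an energy in which each unit's state is shifted by an offset before it meets the weights, and it assumes the offsets are set to the sigmoid of the initial biases so that the shifted activations start out with zero mean. The function names, the toy parameters, and the choice of offsets are illustrative assumptions, not taken from the paper.

```python
import math


def sigmoid(z):
    """Logistic activation."""
    return 1.0 / (1.0 + math.exp(-z))


def centered_hidden_probs(v, W, b_h, beta_v):
    """p(h_j = 1 | v) when visible states are shifted by offsets beta_v.

    Assumed energy: E = -(v - beta_v)^T W (h - beta_h) - b_v^T v - b_h^T h.
    """
    probs = []
    for j in range(len(b_h)):
        z = b_h[j] + sum(W[i][j] * (v[i] - beta_v[i]) for i in range(len(v)))
        probs.append(sigmoid(z))
    return probs


def centered_weight_stat(v, h, beta_v, beta_h):
    """Outer product of centered states for one sample.

    The weight gradient is the data average of this statistic
    minus its model average.
    """
    return [[(vi - bv) * (hj - bh) for hj, bh in zip(h, beta_h)]
            for vi, bv in zip(v, beta_v)]


# Toy parameters (hypothetical values, for illustration only).
b_v = [0.0, 2.0]
b_h = [0.0]
W = [[0.5], [-0.5]]

# Assumption: offsets are the mean activations implied by the initial biases.
beta_v = [sigmoid(b) for b in b_v]
beta_h = [sigmoid(b) for b in b_h]

v = [1.0, 0.0]
h = centered_hidden_probs(v, W, b_h, beta_v)
stat = centered_weight_stat(v, h, beta_v, beta_h)
```

With these offsets, a visible state equal to `beta_v` contributes nothing through the weights, so the hidden probability reduces to `sigmoid(b_h)`. Setting all offsets to zero recovers the usual uncentered conditionals and statistics.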

