NE LG ML
May 11, 2015

Soft-Deep Boltzmann Machines

arXiv:1505.02462v3

AI Analysis

This paper addresses a foundational question in machine learning: it challenges the assumed superiority of DBMs over RBMs and offers a more efficient architecture for generative modeling.

The paper examines the limited representational power of deep Boltzmann machines (DBMs) in exploiting distributed representations. It shows that DBMs can use such representations inefficiently and proposes soft-deep BMs (sDBMs), which outperform several state-of-the-art models, including DBMs, on generative tasks for binarized MNIST and Caltech-101 silhouettes.

We present a layered Boltzmann machine (BM) that can better exploit the advantages of a distributed representation. It is widely believed that deep BMs (DBMs) have far greater representational power than their shallow counterparts, restricted Boltzmann machines (RBMs). However, this expected supremacy of DBMs over RBMs has never been validated theoretically. In this paper, we provide both theoretical and empirical evidence that the representational power of DBMs can actually be rather limited in taking advantage of distributed representations. We propose an approximate measure of the representational power of a BM with respect to the efficiency of a distributed representation. With this measure, we show the surprising fact that DBMs can make inefficient use of distributed representations. Based on these observations, we propose an alternative BM architecture, which we dub soft-deep BMs (sDBMs). We show that sDBMs can exploit distributed representations more efficiently in terms of this measure. Experiments demonstrate that sDBMs outperform several state-of-the-art models, including DBMs, in generative tasks on binarized MNIST and Caltech-101 silhouettes.
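For readers unfamiliar with the two baseline models the abstract compares, the following is a minimal sketch of the standard textbook energy functions of a binary RBM and of a DBM with two hidden layers. It is background only: the abstract does not define the sDBM energy or the proposed measure, so neither is shown. The function names, argument names, and toy values are assumptions made for illustration.

```python
import numpy as np

def rbm_energy(v, h, W, b, c):
    """Standard binary RBM energy: E(v, h) = -b.v - c.h - v^T W h.

    One hidden layer; every visible unit connects to every hidden unit,
    with no connections inside a layer.
    """
    return -(b @ v) - (c @ h) - (v @ W @ h)

def dbm_energy(v, h1, h2, W1, W2, b, c1, c2):
    """Standard two-hidden-layer DBM energy.

    Only adjacent layers interact: v with h1 (through W1) and
    h1 with h2 (through W2). There is no direct v-h2 term.
    """
    return (-(b @ v) - (c1 @ h1) - (c2 @ h2)
            - (v @ W1 @ h1) - (h1 @ W2 @ h2))

# Toy configuration (hypothetical values, chosen only to exercise the code).
v = np.array([1.0, 0.0, 1.0])    # 3 visible units
h1 = np.array([1.0, 1.0])        # 2 units in the first hidden layer
h2 = np.array([1.0])             # 1 unit in the second hidden layer
W1 = np.ones((3, 2))             # visible-to-h1 weights
W2 = np.ones((2, 1))             # h1-to-h2 weights
b, c1, c2 = np.zeros(3), np.zeros(2), np.zeros(1)

e_rbm = rbm_energy(v, h1, W1, b, c1)
e_dbm = dbm_energy(v, h1, h2, W1, W2, b, c1, c2)
print(e_rbm, e_dbm)
```

In both models, lower energy corresponds to higher unnormalized probability, proportional to exp(-E). The DBM differs from the RBM only by stacking a further hidden layer that interacts with the layer directly below it.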
