ML LG NE · Dec 6, 2016

A Probabilistic Framework for Deep Learning

Ankit B. Patel, Tan Nguyen, Richard G. Baraniuk

arXiv:1612.01936v1 · 19.669 citations

Originality: Highly original

AI Analysis

This work addresses the need for a principled probabilistic framework in deep learning, offering insights and improvements for researchers and practitioners, though it is incremental in building upon existing DCN concepts.

The authors set out to give deep convolutional neural networks (DCNs) a probabilistic foundation. They developed the Deep Rendering Mixture Model (DRMM), a generative model in which max-sum inference reproduces the DCN operations and which can be trained with EM as an alternative to back-propagation. They report that classification based on the DRMM and its variants trains 2-3x faster than DCNs on supervised digit classification while reaching similar accuracy. On semi-supervised and unsupervised tasks, the DRMM achieves state-of-the-art results in several categories on MNIST and results comparable to the state of the art on CIFAR10.

We develop a probabilistic framework for deep learning based on the Deep Rendering Mixture Model (DRMM), a new generative probabilistic model that explicitly captures variations in data due to latent task nuisance variables. We demonstrate that max-sum inference in the DRMM yields an algorithm that exactly reproduces the operations in deep convolutional neural networks (DCNs), providing a first-principles derivation. Our framework provides new insights into the successes and shortcomings of DCNs as well as a principled route to their improvement. DRMM training via the Expectation-Maximization (EM) algorithm is a powerful alternative to DCN back-propagation, and initial training results are promising. Classification based on the DRMM and other variants outperforms DCNs in supervised digit classification, training 2-3x faster while achieving similar accuracy. Moreover, the DRMM is applicable to semi-supervised and unsupervised learning tasks, achieving results that are state of the art in several categories on the MNIST benchmark and comparable to the state of the art on the CIFAR10 benchmark.
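The claim that max-sum inference reproduces DCN operations can be illustrated with a toy 1-D sketch. It is not the authors' code. It assumes, for illustration only, that the latent nuisance variables of one layer are a translation g within a pooling window and a binary on/off switch a, and that the score of a configuration is a * (w · x_g + b). The function names (`conv1d_valid`, `dcn_layer`, `max_sum_layer`) are hypothetical. Under these assumptions, maximizing over (g, a) gives the same output as convolution followed by ReLU and max-pooling.

```python
from itertools import product


def conv1d_valid(signal, template, bias):
    """Template-matching scores w . x + b at every valid position."""
    k = len(template)
    return [
        sum(t * s for t, s in zip(template, signal[i:i + k])) + bias
        for i in range(len(signal) - k + 1)
    ]


def dcn_layer(signal, template, bias, pool):
    """Conventional layer: convolution -> ReLU -> non-overlapping max-pooling."""
    activations = [max(0.0, s) for s in conv1d_valid(signal, template, bias)]
    return [
        max(activations[i:i + pool])
        for i in range(0, len(activations) - pool + 1, pool)
    ]


def max_sum_layer(signal, template, bias, pool):
    """The same layer written as an explicit max over nuisance settings.

    Assumption (illustrative): each pooling window has a latent translation g
    (offset inside the window) and a latent switch a (0 = off, 1 = on).
    The score of a setting is a * (w . x_g + b); we keep the best one.
    """
    scores = conv1d_valid(signal, template, bias)
    out = []
    for start in range(0, len(scores) - pool + 1, pool):
        best = max(
            scores[start + g] if a == 1 else 0.0
            for g, a in product(range(pool), (0, 1))
        )
        out.append(best)
    return out


signal = [0.0, 1.0, 2.0, -1.0, 0.5, 3.0, -2.0, 1.0, 0.0]
template = [1.0, -1.0]

# Both formulations agree on this input.
assert max_sum_layer(signal, template, 0.0, 2) == dcn_layer(signal, template, 0.0, 2)
```

The maximum over the switch a plays the role of the ReLU (the "off" setting contributes 0), and the maximum over the translation g plays the role of max-pooling. The sketch covers a single layer with one template and says nothing about the DRMM's generative rendering process or its EM training.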
