LG · ML · Aug 26, 2019

A Probabilistic Representation of Deep Learning

arXiv:1908.09772v1 · 1 citation
AI Analysis

This work provides a theoretical explanation for deep learning properties like hierarchy and generalization, addressing a foundational problem for researchers in machine learning, though it is incremental as it builds on existing probabilistic frameworks.

The paper tackles the problem of explaining deep neural networks by introducing a probabilistic representation that interprets neurons, hidden layers, and the whole architecture in terms of Gibbs distributions and Bayesian neural networks, and it validates this representation with simulation results on a synthetic dataset.

In this work, we introduce a novel probabilistic representation of deep learning, which provides an explicit explanation for the Deep Neural Networks (DNNs) in three aspects: (i) neurons define the energy of a Gibbs distribution; (ii) the hidden layers of DNNs formulate Gibbs distributions; and (iii) the whole architecture of DNNs can be interpreted as a Bayesian neural network. Based on the proposed probabilistic representation, we investigate two fundamental properties of deep learning: hierarchy and generalization. First, we explicitly formulate the hierarchy property from the Bayesian perspective, namely that some hidden layers formulate a prior distribution and the remaining layers formulate a likelihood distribution. Second, we demonstrate that DNNs have an explicit regularization by learning a prior distribution and the learning algorithm is one reason for decreasing the generalization ability of DNNs. Moreover, we clarify two empirical phenomena of DNNs that cannot be explained by traditional theories of generalization. Simulation results validate the proposed probabilistic representation and the insights into these properties of deep learning based on a synthetic dataset.
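Point (i) of the abstract can be made concrete with a small sketch. This is an illustration under stated assumptions, not the paper's own construction: it uses the standard identity that a softmax over neuron pre-activations equals a Gibbs distribution whose energies are the negated activations. The activation values and the function names are hypothetical, chosen only for the example.

```python
import math

def gibbs_distribution(energies, temperature=1.0):
    """Return p_k = exp(-E_k / T) / Z over a finite set of states."""
    # Shift by the minimum energy for numerical stability; the shift cancels in the ratio.
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / temperature) for e in energies]
    z = sum(weights)  # partition function, up to the constant factor removed by the shift
    return [w / z for w in weights]

def softmax(activations):
    """Numerically stable softmax over a list of pre-activations."""
    a_max = max(activations)
    exps = [math.exp(a - a_max) for a in activations]
    total = sum(exps)
    return [x / total for x in exps]

# Hypothetical pre-activations of three neurons (assumed values, not from the paper).
activations = [2.0, 0.5, -1.0]

# Assumption for this sketch: each neuron contributes energy E_k = -activation_k.
energies = [-a for a in activations]

p_gibbs = gibbs_distribution(energies)
p_softmax = softmax(activations)
```

With temperature 1, the two lists agree to floating-point precision, so a neuron with a larger activation corresponds to a lower-energy, more probable state.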
