ML, LG, ST · Apr 9, 2020

Mehler's Formula, Branching Process, and Compositional Kernels of Deep Neural Networks

arXiv:2004.04767v2 · 12 citations
AI Analysis

This paper provides theoretical insights into neural network behavior for researchers in machine learning theory, though it appears to be an incremental extension of existing kernel methods.

The paper tackles the mathematical analysis of deep neural networks by connecting compositional kernels to branching processes via Mehler's formula, providing explicit formulas for kernel eigenvalues to quantify complexity and proposing a random features algorithm with a new activation function for layer compression.

We utilize a connection between compositional kernels and branching processes via Mehler's formula to study deep neural networks. This new probabilistic insight provides us a novel perspective on the mathematical role of activation functions in compositional neural networks. We study the unscaled and rescaled limits of the compositional kernels and explore the different phases of the limiting behavior, as the compositional depth increases. We investigate the memorization capacity of the compositional kernels and neural networks by characterizing the interplay among compositional depth, sample size, dimensionality, and non-linearity of the activation. Explicit formulas on the eigenvalues of the compositional kernel are provided, which quantify the complexity of the corresponding reproducing kernel Hilbert space. On the methodological front, we propose a new random features algorithm, which compresses the compositional layers by devising a new activation function.
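The link between compositional kernels and branching processes can be checked numerically. The sketch below assumes an activation normalized so that its coefficients a_k in the normalized Hermite basis satisfy sum_k a_k^2 = 1. Under that assumption, Mehler's formula gives E[sigma(X) sigma(Y)] = sum_k a_k^2 rho^k for standard Gaussians X, Y with correlation rho, which is the probability generating function of a random variable Z with P(Z = k) = a_k^2. Composing the kernel over L layers then corresponds to the generating function of generation L of a Galton-Watson process with that offspring distribution. The coefficient values and the function names (sigma, dual_kernel, pgf, compose) are illustrative choices, not taken from the paper.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval

# Hypothetical offspring distribution p_k = a_k^2 (illustrative values only).
p = np.array([0.2, 0.5, 0.3])
a = np.sqrt(p)  # Hermite coefficients of the activation; squares sum to 1


def sigma(x):
    """Activation sigma(x) = sum_k a_k h_k(x), with h_k = He_k / sqrt(k!)."""
    scale = np.sqrt([math.factorial(k) for k in range(len(a))])
    return hermeval(x, a / scale)


def dual_kernel(rho, n=40):
    """E[sigma(X) sigma(Y)] for rho-correlated standard Gaussians,
    computed with a tensor-product Gauss-Hermite quadrature rule."""
    z, w = hermegauss(n)            # nodes/weights for the weight exp(-x^2 / 2)
    w = w / np.sqrt(2.0 * np.pi)    # rescale to standard normal probabilities
    z1, z2 = np.meshgrid(z, z, indexing="ij")
    x = z1
    y = rho * z1 + np.sqrt(1.0 - rho**2) * z2  # Corr(x, y) = rho
    return float(np.sum(np.outer(w, w) * sigma(x) * sigma(y)))


def pgf(rho):
    """Probability generating function E[rho^Z] with P(Z = k) = p_k."""
    return float(np.sum(p * rho ** np.arange(len(p))))


def compose(f, rho, depth):
    """Apply f to rho `depth` times (one application per compositional layer)."""
    for _ in range(depth):
        rho = f(rho)
    return rho


if __name__ == "__main__":
    rho = 0.3
    print(dual_kernel(rho), pgf(rho))                          # one layer
    print(compose(dual_kernel, rho, 3), compose(pgf, rho, 3))  # three layers
```

For each depth the two printed values should agree up to floating-point error. The quadrature rule is exact here because the illustrative activation is a polynomial; a non-polynomial activation would need its Hermite coefficients truncated or estimated first.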

