LGMLFeb 15, 2019

Efficient Deep Learning of GMMs

arXiv:1902.05707v12 citations
Originality Highly original
AI Analysis

This provides a theoretical explanation for the efficiency of deep neural networks in practical classification problems like speech, image, and text, though it is incremental as it builds on known universality of Gaussian distributions.

The paper tackles the problem of classifying Gaussian mixture models efficiently, showing that deep neural networks with two hidden layers require only O(n) neurons for optimal classification, while shallow networks need at least O(exp(n)) neurons or exponentially large coefficients.

We show that a collection of Gaussian mixture models (GMMs) in $R^{n}$ can be optimally classified using $O(n)$ neurons in a neural network with two hidden layers (deep neural network), whereas in contrast, a neural network with a single hidden layer (shallow neural network) would require at least $O(\exp(n))$ neurons or possibly exponentially large coefficients. Given the universality of the Gaussian distribution in the feature spaces of data, e.g., in speech, image and text, our result sheds light on the observed efficiency of deep neural networks in practical classification problems.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes