LG · NE · ML · Dec 12, 2015

The Power of Depth for Feedforward Neural Networks

arXiv:1512.03965v4 · 792 citations
Originality: Highly original
AI Analysis

This work provides a foundational theoretical insight for the machine learning community: adding even one layer of depth can yield an exponential gain in representational power, which bears directly on how depth and width are traded off when designing neural network architectures.

The authors address the relative importance of depth versus width in feedforward neural networks. They show that a small 3-layer network can represent a function that no 2-layer network can approximate to better than constant accuracy unless its width is exponential in the input dimension. In this sense, depth can be exponentially more valuable than width.
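To make the claim concrete, the statement can be written schematically as follows. This is a paraphrase, not the paper's exact theorem: the symbols $\sigma$ (activation), $w$ (width), $\mu$ (input distribution), $g$ (target function) and the constant $c$ are notation assumed here, and accuracy is taken to be squared error under $\mu$.

$$
\begin{aligned}
\text{2-layer, width } w:\quad & f(x) = \sum_{i=1}^{w} v_i\,\sigma\big(\langle w_i, x\rangle + b_i\big), \qquad x \in \mathbb{R}^d,\\
\text{3-layer, width } w:\quad & g(x) = \sum_{i=1}^{w} u_i\,\sigma\Big(\sum_{j=1}^{w} v_{i,j}\,\sigma\big(\langle w_{i,j}, x\rangle + b_{i,j}\big) + c_i\Big),\\
\text{separation:}\quad & \exists\, \mu,\ g \text{ of width } \mathrm{poly}(d) \ \text{ such that }\ \mathbb{E}_{x\sim\mu}\big(f(x) - g(x)\big)^2 \ge c \ \text{ for every 2-layer } f \text{ with } w \le c\,e^{c d}.
\end{aligned}
$$

Under this reading, "2-layer" means one hidden layer and "3-layer" means two hidden layers, and the lower bound applies to every choice of weights of the 2-layer network, not only to weights found by training.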

We show that there is a simple (approximately radial) function on $\mathbb{R}^d$, expressible by a small 3-layer feedforward neural network, which cannot be approximated by any 2-layer network, to more than a certain constant accuracy, unless its width is exponential in the dimension. The result holds for virtually all known activation functions, including rectified linear units, sigmoids and thresholds, and formally demonstrates that depth -- even if increased by 1 -- can be exponentially more valuable than width for standard feedforward neural networks. Moreover, compared to related results in the context of Boolean functions, our result requires fewer assumptions, and the proof techniques and construction are very different.
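The "expressible by a small 3-layer network" half of the abstract can be illustrated with a minimal sketch. It assumes ReLU activations and a simple trapezoidal radial target chosen here for illustration, not the paper's hard function. The first hidden layer approximates each $x_i^2$ by a piecewise-linear interpolant, so their sum approximates $\|x\|^2$. The second hidden layer then applies a univariate function to that sum. All names and constants (`D`, `R`, `K`, `bump`, `three_layer`) are hypothetical, and the sketch says nothing about the 2-layer lower bound.

```python
def relu(z):
    return max(z, 0.0)

def pwl_coeffs(f, knots):
    """Bias and ReLU weights so that bias + sum_k a_k * relu(t - knots[k])
    interpolates f linearly at the given knots."""
    vals = [f(t) for t in knots]
    seg = [(vals[k + 1] - vals[k]) / (knots[k + 1] - knots[k])
           for k in range(len(knots) - 1)]            # slope on each segment
    slopes = [seg[0]] + [seg[k] - seg[k - 1] for k in range(1, len(seg))]
    return vals[0], slopes

def pwl_eval(t, knots, bias, slopes):
    # One ReLU unit per segment; zip stops before the last knot.
    return bias + sum(a * relu(t - c) for a, c in zip(slopes, knots))

def bump(s):
    """Illustrative radial profile in s = ||x||^2: a trapezoid on [1, 4]."""
    return max(0.0, min(1.0, s - 1.0, 4.0 - s))

D, R, K = 5, 2.0, 64      # input dimension, coordinate range [-R, R], segments

SQ_KNOTS = [-R + 2.0 * R * k / K for k in range(K + 1)]
SQ_BIAS, SQ_SLOPES = pwl_coeffs(lambda t: t * t, SQ_KNOTS)

H_KNOTS = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
H_BIAS, H_SLOPES = pwl_coeffs(bump, H_KNOTS)

def three_layer(x):
    # Hidden layer 1: D*K ReLU units; their weighted sum approximates ||x||^2.
    s = sum(pwl_eval(xi, SQ_KNOTS, SQ_BIAS, SQ_SLOPES) for xi in x)
    # Hidden layer 2: 5 ReLU units on s, followed by a linear output.
    return pwl_eval(s, H_KNOTS, H_BIAS, H_SLOPES)

def target(x):
    return bump(sum(xi * xi for xi in x))
```

In this sketch the total width is `D * K + 5`, which grows linearly in the dimension for a fixed per-coordinate resolution. The approximation error comes only from the first layer: each interpolant of $t^2$ is off by at most $\delta^2/4$ for knot spacing $\delta = 2R/K$, and `bump` is 1-Lipschitz, so the overall error on $[-R, R]^D$ is at most $D\,\delta^2/4$.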
