LGJan 18, 2021

A simple geometric proof for the benefit of depth in ReLU networks

arXiv:2101.07126v12 citations
Originality Synthesis-oriented
AI Analysis

This provides a foundational insight into neural network design, though it is incremental as it builds on existing depth separation results with simpler proofs.

The paper tackles the problem of proving the benefit of depth in ReLU networks by showing that for a sequence of classification problems, shallow networks require exponentially many parameters while deep networks with linear depth and constant width achieve zero error.

We present a simple proof for the benefit of depth in multi-layer feedforward network with rectified activation ("depth separation"). Specifically we present a sequence of classification problems indexed by $m$ such that (a) for any fixed depth rectified network there exist an $m$ above which classifying problem $m$ correctly requires exponential number of parameters (in $m$); and (b) for any problem in the sequence, we present a concrete neural network with linear depth (in $m$) and small constant width ($\leq 4$) that classifies the problem with zero error. The constructive proof is based on geometric arguments and a space folding construction. While stronger bounds and results exist, our proof uses substantially simpler tools and techniques, and should be accessible to undergraduate students in computer science and people with similar backgrounds.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes