LG · ML · Oct 3, 2018

Understanding Weight Normalized Deep Neural Networks with Rectified Linear Units

arXiv:1810.01877v3 · 12 citations
Originality: Incremental advance
AI Analysis

The paper offers machine learning researchers theoretical insight into norm-based regularization, but the advance is incremental because it builds on existing normalization frameworks.

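The paper addresses capacity control and generalization in deep neural networks by analyzing $L_{p,q}$ weight normalization. It establishes upper bounds on the Rademacher complexity of such networks and shows that, when $q\le p^*$, the capacity bound is independent of width and depends on depth only through a square-root term. For $L_{1,\infty}$ normalization, the approximation error is controlled by the $L_1$ norm of the output layer.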

This paper presents a general framework for norm-based capacity control for $L_{p,q}$ weight normalized deep neural networks. We establish the upper bound on the Rademacher complexities of this family. With an $L_{p,q}$ normalization where $q\le p^*$, and $1/p+1/p^{*}=1$, we discuss properties of a width-independent capacity control, which only depends on depth by a square root term. We further analyze the approximation properties of $L_{p,q}$ weight normalized deep neural networks. In particular, for an $L_{1,\infty}$ weight normalized network, the approximation error can be controlled by the $L_1$ norm of the output layer, and the corresponding generalization error only depends on the architecture by the square root of the depth.
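To make the $L_{p,q}$ notation concrete, here is a minimal NumPy sketch. It assumes the common convention that $\|W\|_{p,q}$ is the $L_q$ norm of the vector of column-wise $L_p$ norms, so that $L_{1,\infty}$ is the largest column $L_1$ norm. The abstract does not state the convention, so the paper's row/column orientation may differ. The function names and the rescaling step are hypothetical illustrations, not the authors' implementation.

```python
import numpy as np


def lpq_norm(W, p, q):
    """L_{p,q} norm under the assumed convention:
    L_p norm of each column, then L_q norm of those column norms."""
    W = np.asarray(W, dtype=float)
    col_norms = np.linalg.norm(W, ord=p, axis=0)  # vector p-norm per column
    return float(np.linalg.norm(col_norms, ord=q))


def conjugate_exponent(p):
    """Return p* such that 1/p + 1/p* = 1 (with 1* = inf and inf* = 1)."""
    if p == 1:
        return np.inf
    if np.isinf(p):
        return 1.0
    return p / (p - 1.0)


def normalize_lpq(W, p, q, c=1.0):
    """Rescale W so that its L_{p,q} norm is at most c.
    Simple rescaling is an illustrative choice, not the paper's procedure."""
    W = np.asarray(W, dtype=float)
    norm = lpq_norm(W, p, q)
    if norm <= c:
        return W
    return W * (c / norm)


W = np.array([[1.0, -2.0],
              [3.0, 0.5]])

# Column L1 norms are 4.0 and 2.5, so the L_{1,inf} norm is 4.0.
norm_1_inf = lpq_norm(W, 1, np.inf)

# The width-independent regime in the abstract requires q <= p*.
p, q = 1, np.inf
in_width_independent_regime = q <= conjugate_exponent(p)

W_hat = normalize_lpq(W, 1, np.inf, c=1.0)
```

With $p=1$ the conjugate exponent is $p^*=\infty$, so any $q$ satisfies $q\le p^*$. This is consistent with the abstract singling out $L_{1,\infty}$ as a case where the bound depends on the architecture only through the square root of the depth.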

