LG ML Oct 3, 2018

Understanding Weight Normalized Deep Neural Networks with Rectified Linear Units

arXiv:1810.01877v34.712 citations

Originality Incremental advance

AI Analysis

The paper offers machine learning researchers theoretical insight into norm-based regularization, but the advance is incremental because it builds on existing normalization frameworks.

The paper tackles capacity control and generalization in deep neural networks by analyzing $L_{p,q}$ weight normalization. It establishes upper bounds on the Rademacher complexities of such networks and shows that, for $L_{1,\infty}$ normalization, the approximation error is controlled by the $L_1$ norm of the output layer while the generalization error depends on the architecture only through the square root of the depth.

This paper presents a general framework for norm-based capacity control for $L_{p,q}$ weight normalized deep neural networks. We establish the upper bound on the Rademacher complexities of this family. With an $L_{p,q}$ normalization where $q\le p^*$, and $1/p+1/p^{*}=1$, we discuss properties of a width-independent capacity control, which only depends on depth by a square root term. We further analyze the approximation properties of $L_{p,q}$ weight normalized deep neural networks. In particular, for an $L_{1,\infty}$ weight normalized network, the approximation error can be controlled by the $L_1$ norm of the output layer, and the corresponding generalization error only depends on the architecture by the square root of the depth.
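To make the $L_{p,q}$ notation concrete, the following is a minimal sketch in plain Python rather than the authors' implementation. It assumes a common convention that the abstract itself does not spell out: each column of a weight matrix holds the incoming weights of one unit, the $L_p$ norm is taken within each column, and the $L_q$ norm is taken across the resulting column norms. The function names (`vector_norm`, `lpq_norm`, `normalize_lpq`, `conjugate_exponent`) and the norm budget `c` are hypothetical, chosen only for illustration.

```python
import math


def vector_norm(values, r):
    """L_r norm of a list of numbers; r may be math.inf."""
    magnitudes = [abs(v) for v in values]
    if math.isinf(r):
        return max(magnitudes)
    return sum(m ** r for m in magnitudes) ** (1.0 / r)


def lpq_norm(matrix, p, q):
    """L_{p,q} norm under the assumed convention:
    L_p norm of each column, then L_q norm of the column norms."""
    n_cols = len(matrix[0])
    column_norms = []
    for j in range(n_cols):
        column = [row[j] for row in matrix]
        column_norms.append(vector_norm(column, p))
    return vector_norm(column_norms, q)


def normalize_lpq(matrix, p, q, c=1.0):
    """Rescale the whole matrix so its L_{p,q} norm equals c.
    Uniform rescaling is an illustrative choice, not the paper's procedure."""
    norm = lpq_norm(matrix, p, q)
    if norm == 0.0:
        return [list(row) for row in matrix]  # nothing to rescale
    scale = c / norm
    return [[w * scale for w in row] for row in matrix]


def conjugate_exponent(p):
    """Return p* satisfying 1/p + 1/p* = 1."""
    if p == 1:
        return math.inf
    if math.isinf(p):
        return 1.0
    return p / (p - 1.0)


# Example weight matrix: 2 inputs (rows) feeding 2 units (columns).
W = [[1.0, -2.0],
     [3.0, 4.0]]

l1_inf = lpq_norm(W, 1, math.inf)       # largest per-unit L_1 norm
frobenius = lpq_norm(W, 2, 2)           # L_{2,2} is the Frobenius norm
W_unit = normalize_lpq(W, 1, math.inf)  # rescaled to L_{1,inf} norm 1
```

With $p = 1$ the conjugate exponent is $p^* = \infty$, so every $q$ satisfies $q \le p^*$; the $L_{1,\infty}$ case highlighted in the abstract therefore falls inside the width-independent regime.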
