CV, LG · Apr 6, 2019

Iterative Normalization: Beyond Standardization towards Efficient Whitening

arXiv:1904.03441v1 · 185 citations
Originality: Incremental advance
AI Analysis

This paper addresses an efficiency bottleneck in neural network normalization, giving researchers and practitioners a more efficient whitening method, though the advance is incremental over existing approaches.

The paper tackles the inefficiency of Decorrelated Batch Normalization (DBN) by proposing Iterative Normalization (IterNorm), which uses Newton's iterations for efficient whitening without eigen-decomposition. It reports consistently improved performance over BN and DBN on CIFAR-10 and ImageNet.

Batch Normalization (BN) is ubiquitously employed for accelerating neural network training and improving the generalization capability by performing standardization within mini-batches. Decorrelated Batch Normalization (DBN) further boosts the above effectiveness by whitening. However, DBN relies heavily on either a large batch size, or eigen-decomposition that suffers from poor efficiency on GPUs. We propose Iterative Normalization (IterNorm), which employs Newton's iterations for much more efficient whitening, while simultaneously avoiding the eigen-decomposition. Furthermore, we develop a comprehensive study to show IterNorm has better trade-off between optimization and generalization, with theoretical and experimental support. To this end, we exclusively introduce Stochastic Normalization Disturbance (SND), which measures the inherent stochastic uncertainty of samples when applied to normalization operations. With the support of SND, we provide natural explanations to several phenomena from the perspective of optimization, e.g., why group-wise whitening of DBN generally outperforms full-whitening and why the accuracy of BN degenerates with reduced batch sizes. We demonstrate the consistently improved performance of IterNorm with extensive experiments on CIFAR-10 and ImageNet over BN and DBN.
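
The idea of whitening with Newton's iterations instead of an eigen-decomposition can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `iternorm_whiten`, the trace normalization of the covariance, the default of 5 iterations, and the `eps` regularizer are all assumptions made for the example, and the update used is the standard Newton-Schulz-style iteration for an inverse matrix square root.

```python
import numpy as np


def iternorm_whiten(x, num_iters=5, eps=1e-5):
    """Approximately whiten a mini-batch without eigen-decomposition.

    x: array of shape (d, m), d features by m samples (layout is an assumption).
    num_iters: number of Newton iterations (hypothetical default).
    eps: small ridge added to the covariance for numerical safety.
    """
    d, m = x.shape

    # Center each feature over the mini-batch.
    xc = x - x.mean(axis=1, keepdims=True)

    # Regularized mini-batch covariance, shape (d, d).
    sigma = xc @ xc.T / m + eps * np.eye(d)

    # Divide by the trace so all eigenvalues lie in (0, 1],
    # which is what the iteration below needs in order to converge.
    trace = np.trace(sigma)
    sigma_n = sigma / trace

    # Newton-Schulz-style iteration: p converges to sigma_n^(-1/2).
    p = np.eye(d)
    for _ in range(num_iters):
        p = 0.5 * (3.0 * p - p @ p @ p @ sigma_n)

    # Undo the trace normalization to approximate sigma^(-1/2).
    whitening = p / np.sqrt(trace)
    return whitening @ xc
```

Only matrix multiplications are involved, which is why this style of whitening maps well onto GPUs. With few iterations the output is only approximately whitened, and running many more iterations on a poorly conditioned covariance can amplify rounding errors, so the iteration count acts as a practical knob rather than something to maximize.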

Code Implementations: 5 repos
