CV, LG | Jan 15, 2021

Dynamic Normalization

arXiv:2101.06073v1 | 1 citation
Originality: Incremental advance
AI Analysis

This work addresses the need for more robust normalization in CNNs, particularly for tasks like classification and detection, though it appears incremental as an extension of BN.

The authors tackled the limitations of Batch Normalization by proposing Dynamic Normalization (DN-B), which adaptively generates scale and shift parameters per channel and sample, improving robustness and performance. In experiments, DN-B increased MobileNetV2 accuracy on ImageNet-100 by over 2% with minimal computational overhead and boosted SSDLite mAP on MS-COCO by nearly 4%.

Batch Normalization (BN) has become one of the essential components of CNNs. It allows the network to use a higher learning rate, speeds up training, and removes the need for careful initialization. However, in our work, we find that a simple extension of BN can increase the performance of the network. First, we extend BN to adaptively generate scale and shift parameters for each mini-batch, called DN-C (Batch-shared and Channel-wise). We use the statistical characteristics of the mini-batch ($E[X], Std[X]\in\mathbb{R}^{c}$) as the input of the SC module. Then we extend BN to adaptively generate scale and shift parameters for each channel of each sample, called DN-B (Batch and Channel-wise). Our experiments show that the DN-C model fails to train normally, whereas the DN-B model is very robust. In the classification task, DN-B improves the accuracy of MobileNetV2 on ImageNet-100 by more than 2% with only 0.6% additional Mult-Adds. In the detection task, DN-B improves the accuracy of SSDLite on MS-COCO by nearly 4% mAP with the same settings. Compared with BN, DN-B performs stably when using a higher learning rate or a smaller batch size.
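The difference between the two variants described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the SC module, so a single linear map stands in for it here, and the names `dn_c_forward`, `dn_b_forward`, `w_gamma`, and `w_beta`, as well as the choice of a per-sample channel average as the DN-B descriptor, are assumptions made only for illustration.

```python
import numpy as np


def batch_norm_stats(x, eps=1e-5):
    """Per-channel mean and std over the batch and spatial axes of an (N, C, H, W) array."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    std = np.sqrt(x.var(axis=(0, 2, 3), keepdims=True) + eps)
    return mean, std


def dn_c_forward(x, w_gamma, w_beta, eps=1e-5):
    """DN-C-style sketch: one (gamma, beta) per channel, shared by the whole mini-batch.

    w_gamma, w_beta: (2C, C) hypothetical weights standing in for the SC module.
    """
    mean, std = batch_norm_stats(x, eps)
    x_hat = (x - mean) / std                              # standard BN normalization
    # Input to the stand-in module: the mini-batch statistics E[X] and Std[X].
    stats = np.concatenate([mean.ravel(), std.ravel()])   # (2C,)
    gamma = 1.0 + stats @ w_gamma                         # (C,)
    beta = stats @ w_beta                                 # (C,)
    return gamma[None, :, None, None] * x_hat + beta[None, :, None, None]


def dn_b_forward(x, w_gamma, w_beta, eps=1e-5):
    """DN-B-style sketch: a separate (gamma, beta) for each channel of each sample.

    w_gamma, w_beta: (C, C) hypothetical weights standing in for the SC module.
    """
    mean, std = batch_norm_stats(x, eps)
    x_hat = (x - mean) / std                              # standard BN normalization
    # Assumed descriptor: spatial average of each sample's normalized channels.
    desc = x_hat.mean(axis=(2, 3))                        # (N, C)
    gamma = 1.0 + desc @ w_gamma                          # (N, C)
    beta = desc @ w_beta                                  # (N, C)
    return gamma[:, :, None, None] * x_hat + beta[:, :, None, None]
```

With all weights set to zero, both functions reduce to plain BN with scale 1 and shift 0, which makes the sketch easy to sanity-check. With nonzero weights, `dn_c_forward` applies the same scale and shift to every sample in the batch, while `dn_b_forward` applies a different pair to each sample.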

