LG · Dec 23, 2021

A Robust Initialization of Residual Blocks for Effective ResNet Training without Batch Normalization

arXiv:2112.12299v2 · 1 citation
AI Analysis

This paper addresses the practical issues that Batch Normalization poses for deep learning researchers and practitioners, though the contribution is incremental because it builds on existing normalization-free architectures.

The paper tackles the problem of training ResNet-like networks without Batch Normalization by proposing a modified initialization for residual blocks, and reports competitive results on CIFAR-10, CIFAR-100, and ImageNet.

Batch Normalization is an essential component of all state-of-the-art neural network architectures. However, since it introduces many practical issues, much recent research has been devoted to designing normalization-free architectures. In this paper, we show that weight initialization is key to training ResNet-like normalization-free networks. In particular, we propose a slight modification to the operation that sums a block's output with the skip-connection branch, so that the whole network is correctly initialized. We show that this modified architecture achieves competitive results on CIFAR-10, CIFAR-100 and ImageNet without further regularization or algorithmic modifications.
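
The abstract does not spell out the exact modification, so the following is only a toy sketch of why the summation at the end of a residual block matters at initialization. It assumes a linear residual branch with variance-preserving Gaussian weights and compares a plain sum `x + f(x)` with a sum rescaled by `1/sqrt(2)`. The rescaling factor, the helper names (`random_linear`, `forward`), and the dimensions are illustrative assumptions, not the paper's method.

```python
import math
import random


def random_linear(n, rng):
    """n x n matrix with i.i.d. N(0, 1/n) entries, so W @ x roughly keeps the scale of x."""
    std = 1.0 / math.sqrt(n)
    return [[rng.gauss(0.0, std) for _ in range(n)] for _ in range(n)]


def matvec(w, x):
    """Plain-Python matrix-vector product."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def second_moment(x):
    """Mean of squared entries, used here as a proxy for activation variance."""
    return sum(v * v for v in x) / len(x)


def forward(x, depth, scaled, rng):
    """Pass x through `depth` toy residual blocks; return the output second moment."""
    for _ in range(depth):
        branch = matvec(random_linear(len(x), rng), x)  # toy residual branch f(x)
        summed = [a + b for a, b in zip(x, branch)]     # skip connection + branch
        if scaled:
            # Assumed fix (not taken from the paper): rescale the sum so that
            # the second moment stays roughly constant from block to block.
            summed = [v / math.sqrt(2.0) for v in summed]
        x = summed
    return second_moment(x)


rng = random.Random(0)
n, depth = 128, 10
x0 = [rng.gauss(0.0, 1.0) for _ in range(n)]
plain = forward(x0, depth, scaled=False, rng=rng)
scaled = forward(x0, depth, scaled=True, rng=rng)
print(f"input second moment:  {second_moment(x0):.2f}")
print(f"plain sum, depth {depth}:  {plain:.1f}")
print(f"scaled sum, depth {depth}: {scaled:.2f}")
```

With the plain sum, the signal scale roughly doubles at every block and grows exponentially with depth, which is the kind of instability that Batch Normalization usually masks. With the rescaled sum, the scale stays close to that of the input.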

