CV | LG | Mar 16, 2016

Identity Mappings in Deep Residual Networks

arXiv:1603.05027v3 | 11185 citations | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the challenge of training extremely deep neural networks for computer vision tasks, providing incremental improvements to residual network architectures.

The paper addresses the difficulty of training very deep residual networks by analyzing the propagation formulations behind residual building blocks. That analysis motivates a new residual unit in which both the skip connection and the after-addition activation are identity mappings. The new unit makes training easier and improves generalization: a 1001-layer ResNet reaches 4.62% error on CIFAR-10, with improved results also reported on CIFAR-100 and, using a 200-layer ResNet, on ImageNet.

Deep residual networks have emerged as a family of extremely deep architectures showing compelling accuracy and nice convergence behaviors. In this paper, we analyze the propagation formulations behind the residual building blocks, which suggest that the forward and backward signals can be directly propagated from one block to any other block, when using identity mappings as the skip connections and after-addition activation. A series of ablation experiments support the importance of these identity mappings. This motivates us to propose a new residual unit, which makes training easier and improves generalization. We report improved results using a 1001-layer ResNet on CIFAR-10 (4.62% error) and CIFAR-100, and a 200-layer ResNet on ImageNet. Code is available at: https://github.com/KaimingHe/resnet-1k-layers
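
The direct-propagation claim in the abstract can be illustrated with a small sketch. It assumes the recurrence x_{l+1} = x_l + F(x_l, W_l), i.e. an identity skip connection with no activation applied after the addition. The `residual_branch` function and the scalar weights are hypothetical stand-ins chosen for brevity, not the paper's convolutional unit. Under that assumption, the output of a stack equals the input plus the sum of all residual outputs, so the signal from any block reaches any later block unchanged.

```python
import math
import random

def residual_branch(x, weight):
    """Toy residual function F(x, W): an elementwise scaled tanh (hypothetical)."""
    return [weight * math.tanh(v) for v in x]

def forward(x, weights):
    """Stack residual units with identity skips and no after-addition activation:
    x_{l+1} = x_l + F(x_l, W_l). Returns the output and each unit's residual."""
    residuals = []
    for w in weights:
        r = residual_branch(x, w)
        residuals.append(r)
        x = [xi + ri for xi, ri in zip(x, r)]
    return x, residuals

random.seed(0)
x0 = [random.uniform(-1, 1) for _ in range(4)]
weights = [random.uniform(-0.5, 0.5) for _ in range(10)]

x_out, residuals = forward(x0, weights)

# Unrolled form: x_L = x_0 + sum over units of F(x_i, W_i).
unrolled = [x0[j] + sum(r[j] for r in residuals) for j in range(len(x0))]

# Largest elementwise difference between the layer-by-layer and unrolled results.
max_gap = max(abs(a - b) for a, b in zip(x_out, unrolled))
```

Here `max_gap` should be zero up to floating-point rounding. If a nonlinearity were applied after each addition, or the skip path were scaled, the unrolled sum would no longer hold in this form, which is the intuition behind keeping both paths as identity mappings.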

Code Implementations: 54 repos

Data from Papers with Code (CC-BY-SA-4.0)

