CV · Apr 14, 2016

Deep Residual Networks with Exponential Linear Unit

arXiv:1604.04112v4 · 129 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses performance issues in deep learning models for computer vision tasks, but it is incremental as it builds on existing Residual Networks.

The paper tackles the problems of vanishing gradients and degradation in very deep convolutional neural networks by replacing ReLU and Batch Normalization with Exponential Linear Units in Residual Networks, resulting in faster learning and improved accuracy on datasets like CIFAR-10 and CIFAR-100.

Very deep convolutional neural networks introduced new problems like vanishing gradients and degradation. The recent successful contributions towards solving these problems are Residual and Highway Networks. These networks introduce skip connections that allow information (from the input or learned in earlier layers) to flow more easily into the deeper layers. These very deep models have led to a considerable decrease in test errors on benchmarks like ImageNet and COCO. In this paper, we propose the use of the exponential linear unit instead of the combination of ReLU and Batch Normalization in Residual Networks. We show that this not only speeds up learning in Residual Networks but also improves the accuracy as the depth increases. It improves the test error on almost all data sets, like CIFAR-10 and CIFAR-100.
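The idea in the abstract can be illustrated with a minimal sketch, assuming the standard ELU definition (x for x > 0, alpha * (exp(x) - 1) otherwise) and a toy residual block built from dense layers in place of convolutions. The names `elu`, `linear`, and `residual_block_elu` are hypothetical, and the exact placement of ELU inside the paper's blocks is not stated here, so the layout below is an assumption rather than the authors' architecture.

```python
import math


def elu(x, alpha=1.0):
    """Exponential linear unit: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * math.expm1(x)


def linear(weights, bias, xs):
    """Dense layer standing in for a convolution: one output per weight row."""
    return [sum(w * x for w, x in zip(row, xs)) + b
            for row, b in zip(weights, bias)]


def residual_block_elu(xs, w1, b1, w2, b2):
    """Toy residual block y = x + F(x), using ELU as the nonlinearity in F.

    Assumption: no batch normalization, and ELU applied once between the
    two layers. The paper's actual block layout may differ.
    """
    hidden = [elu(h) for h in linear(w1, b1, xs)]
    residual = linear(w2, b2, hidden)
    # Skip connection: the input is added back to the learned residual.
    return [x + r for x, r in zip(xs, residual)]
```

Unlike ReLU, ELU returns small negative values for negative inputs instead of zero, so activations can stay closer to zero mean without a separate normalization step. With all weights and biases set to zero, the block reduces to the identity through the skip connection.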

Code Implementations: 1 repo
