CV · Apr 5, 2018

Learning Strict Identity Mappings in Deep Residual Networks

arXiv:1804.01661v5 · 14 citations
Originality: Incremental advance
AI Analysis

The paper addresses the inefficiency of super deep networks for practitioners by reducing computational cost without sacrificing accuracy. The advance is incremental because it builds on the existing ResNet architecture.

The paper tackles redundant layers in deep residual networks, which are often added for marginal performance gains at high computational cost. It proposes epsilon-ResNet, which automatically discards layers whose responses fall below a threshold epsilon. In some instances this reduces the number of parameters by about 80% with marginal or no loss in performance.

A family of super deep networks, referred to as residual networks or ResNet, achieved record-beating performance in various visual tasks such as image recognition, object detection, and semantic segmentation. The ability to train very deep networks naturally pushed researchers to use enormous resources to achieve the best performance. Consequently, in many applications super deep residual networks were employed for just a marginal improvement in performance. In this paper, we propose epsilon-ResNet, which allows us to automatically discard redundant layers, namely those that produce responses smaller than a threshold epsilon, with a marginal or no loss in performance. The epsilon-ResNet architecture can be achieved using a few additional rectified linear units in the original ResNet. Our method uses neither additional variables nor numerous trials, unlike other hyper-parameter optimization techniques. The layer selection is achieved using a single training process, and the evaluation is performed on the CIFAR-10, CIFAR-100, SVHN, and ImageNet datasets. In some instances, we achieve about 80% reduction in the number of parameters.
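The thresholding idea in the abstract can be illustrated with a minimal NumPy sketch. It assumes a residual block of the form x + F(x) and zeroes the branch output F(x) when every response is smaller in magnitude than epsilon, so that the block becomes a strict identity mapping. The abstract states that the actual architecture realizes this with a few additional rectified linear units; the plain comparison used here is a stand-in for that construction, and the names `epsilon_gate`, `residual_block`, `weak_branch`, and `strong_branch` are hypothetical, chosen only for illustration.

```python
import numpy as np

def epsilon_gate(residual, epsilon):
    """Return the residual unchanged, or zeros if every response is below epsilon.

    Assumption: a direct comparison stands in for the ReLU-based gate
    mentioned in the abstract.
    """
    if np.all(np.abs(residual) < epsilon):
        return np.zeros_like(residual)
    return residual

def residual_block(x, branch, epsilon):
    """Compute x + gate(branch(x)); the block is an identity when gated off."""
    return x + epsilon_gate(branch(x), epsilon)

# Hypothetical branches, for illustration only.
weak_branch = lambda x: 0.001 * x    # all responses below the threshold
strong_branch = lambda x: 0.5 * x    # responses above the threshold

x = np.array([1.0, -2.0, 3.0])
out_weak = residual_block(x, weak_branch, epsilon=0.01)      # block acts as identity
out_strong = residual_block(x, strong_branch, epsilon=0.01)  # block is kept
```

With these inputs the weak branch is gated off and the block returns its input unchanged, which is the sense in which such a layer is redundant and can be discarded, while the strong branch passes through the gate untouched.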

Code Implementations: 1 repo