CV, LG | Feb 28, 2017

ShaResNet: reducing residual network parameter number by sharing weights

arXiv:1702.08782v2 | 29 citations
Originality: Incremental advance
AI Analysis

This addresses the prohibitive memory and depth costs in image classification for practitioners, but it is incremental as it builds on existing residual network architectures.

The paper tackles the high parameter count of deep residual networks by proposing ShaResNet, which shares convolutional layer weights between residual blocks operating at the same spatial scale. This reduces a 152-layer network to 106 convolutional layers, a 39% reduction in parameters, with less than 0.2% accuracy loss on ImageNet.
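The 152-to-106 figure can be reproduced with simple layer arithmetic. The sketch below is an illustration under stated assumptions, not the authors' code: it assumes the standard ResNet-152 bottleneck layout with stage depths (3, 8, 36, 3), counts layers by the usual ResNet convention (stem convolution, three convolutions per bottleneck block, final fully connected layer), and assumes exactly one convolution per block is shared across all blocks of a stage. The names `STAGE_BLOCKS`, `plain_depth`, and `shared_depth` are hypothetical.

```python
# Assumption: standard ResNet-152 stage depths (bottleneck blocks per spatial scale).
STAGE_BLOCKS = (3, 8, 36, 3)
# Each bottleneck block holds three convolutions (1x1, 3x3, 1x1).
LAYERS_PER_BLOCK = 3


def plain_depth(stage_blocks):
    """Layer count of a plain residual network: stem + block convolutions + classifier."""
    return 1 + LAYERS_PER_BLOCK * sum(stage_blocks) + 1


def shared_depth(stage_blocks):
    """Distinct layer count when one convolution per block is shared within each stage.

    A stage with n blocks keeps a single copy of the shared layer,
    so it saves n - 1 layers (assumption: one shared layer per stage).
    """
    saved = sum(n - 1 for n in stage_blocks)
    return plain_depth(stage_blocks) - saved


if __name__ == "__main__":
    print(plain_depth(STAGE_BLOCKS))   # distinct layers without sharing
    print(shared_depth(STAGE_BLOCKS))  # distinct layers with per-stage sharing
```

Under these assumptions the saving is 2 + 7 + 35 + 2 = 46 layers, which matches the drop from 152 to 106 quoted above. The sketch counts layers only; it does not attempt to reproduce the 39% parameter figure.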

Deep Residual Networks have reached the state of the art in many image processing tasks such as image classification. However, the cost for a gain in accuracy in terms of depth and memory is prohibitive, as it requires a higher number of residual blocks, up to double the initial value. To tackle this problem, we propose in this paper a way to reduce the redundant information of the networks. We share the weights of convolutional layers between residual blocks operating at the same spatial scale, so the signal flows multiple times through the same convolutional layer. The resulting architecture, called ShaResNet, contains block-specific layers and shared layers. ShaResNets are trained in exactly the same fashion as the commonly used residual networks. We show, on the one hand, that they are almost as efficient as their sequential counterparts while involving fewer parameters, and, on the other hand, that they are more efficient than a residual network with the same number of parameters. For example, a 152-layer-deep residual network can be reduced to 106 convolutional layers, i.e. a parameter reduction of 39%, while losing less than 0.2% accuracy on ImageNet.
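The sharing scheme described in the abstract can be sketched as a single layer object referenced by every block of a stage. This is a minimal, framework-free illustration, not the paper's implementation: `Conv`, `Bottleneck`, `build_stage`, and `count_unique_params` are hypothetical names, the layers only track their weight counts, every block is assumed to take 4 x width input channels, and the shared layer is assumed to be the 3x3 convolution of a bottleneck block.

```python
class Conv:
    """Stand-in for a convolutional layer; it only tracks its weight count (no bias)."""

    def __init__(self, c_in, c_out, kernel):
        self.num_params = c_in * c_out * kernel * kernel


class Bottleneck:
    """Bottleneck block: block-specific 1x1 layers plus a possibly shared 3x3 layer."""

    def __init__(self, width, shared_conv3x3=None):
        self.reduce = Conv(4 * width, width, 1)   # block-specific
        # Reuse the stage-level layer if one is given, else create a private one.
        self.spatial = shared_conv3x3 if shared_conv3x3 is not None else Conv(width, width, 3)
        self.expand = Conv(width, 4 * width, 1)   # block-specific

    def layers(self):
        return [self.reduce, self.spatial, self.expand]


def build_stage(width, num_blocks, share):
    """Build the blocks of one spatial scale, optionally sharing the 3x3 layer."""
    shared = Conv(width, width, 3) if share else None
    return [Bottleneck(width, shared) for _ in range(num_blocks)]


def count_unique_params(blocks):
    """Sum weights over distinct layer objects, so a shared layer is counted once."""
    unique = {id(layer): layer for block in blocks for layer in block.layers()}
    return sum(layer.num_params for layer in unique.values())


if __name__ == "__main__":
    plain = build_stage(width=64, num_blocks=3, share=False)
    shared = build_stage(width=64, num_blocks=3, share=True)
    print(count_unique_params(plain), count_unique_params(shared))
```

In a real framework the same idea amounts to passing one convolution module to several blocks, so that gradients from every block accumulate into the same weights. The toy stage here is only meant to show why the parameter count drops, not to reproduce the reported numbers.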

Code Implementations: 2 repos