CV · Jan 18, 2018

Sparsely Aggregated Convolutional Networks

arXiv:1801.05895v3 · 27 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of training very deep networks more efficiently for computer vision tasks, offering a novel architectural improvement.

The paper addresses how to design internal skip connections in deep convolutional neural networks. It proposes a sparse aggregation pattern in which each layer consumes only a small subset of earlier outputs. The authors report better performance with fewer parameters and less computation, and more robust scaling to networks of over 1000 layers.

We explore a key architectural aspect of deep convolutional neural networks: the pattern of internal skip connections used to aggregate outputs of earlier layers for consumption by deeper layers. Such aggregation is critical to facilitate training of very deep networks in an end-to-end manner. This is a primary reason for the widespread adoption of residual networks, which aggregate outputs via cumulative summation. While subsequent works investigate alternative aggregation operations (e.g. concatenation), we focus on an orthogonal question: which outputs to aggregate at a particular point in the network. We propose a new internal connection structure which aggregates only a sparse set of previous outputs at any given depth. Our experiments demonstrate this simple design change offers superior performance with fewer parameters and lower computational requirements. Moreover, we show that sparse aggregation allows networks to scale more robustly to 1000+ layers, thereby opening future avenues for training long-running visual processes.
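To make the "which outputs to aggregate" question concrete, the sketch below compares two connection patterns by counting incoming skip connections. It is illustrative only: the abstract does not state the exact sparse pattern, so the power-of-two offsets used here (layer i reads from layers i-1, i-2, i-4, ...) are an assumption, and the names `sparse_inputs`, `dense_inputs`, and `total_connections` are hypothetical helpers, not the authors' code.

```python
def sparse_inputs(layer, base=2):
    """Indices of earlier outputs aggregated at `layer` (index 0 is the network input).

    Assumed pattern: exponentially spaced offsets 1, base, base**2, ...
    """
    sources = []
    offset = 1
    while layer - offset >= 0:
        sources.append(layer - offset)
        offset *= base
    return sources


def dense_inputs(layer):
    """Dense aggregation: every earlier output feeds into `layer`."""
    return list(range(layer))


def total_connections(num_layers, pattern):
    """Total number of incoming skip connections over layers 1..num_layers."""
    return sum(len(pattern(i)) for i in range(1, num_layers + 1))
```

Under this assumed pattern, the number of inputs per layer grows logarithmically with depth, whereas dense aggregation grows linearly. That gap is one intuition for why a sparse pattern could stay tractable at 1000+ layers.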

Code Implementations: 2 repos
