LG | Dec 6, 2013

Understanding Deep Architectures using a Recursive Convolutional Network

David Eigen, Jason Rolfe, Rob Fergus, Yann LeCun

arXiv:1312.1847v2 | 148 citations

Originality: Incremental advance

AI Analysis

This work addresses the problem of efficient network design for machine learning practitioners, but it is incremental as it builds on existing understanding of convolutional architectures.

The paper tackles the challenge of sizing convolutional networks by assessing the independent contributions of the numbers of layers, feature maps, and parameters, using a recursive convolutional network with tied weights. It finds that increasing the number of layers or the number of parameters each has a clear benefit, while the number of feature maps appears ancillary and gains most of its benefit from the extra weights it introduces. The results empirically confirm that adding layers alone increases computational power, and they suggest that attention should shift from feature map dimensions to parameter counts.

A key challenge in designing convolutional network models is sizing them appropriately. Many factors are involved in these decisions, including the number of layers, feature maps, kernel sizes, etc. Complicating this further is the fact that each of these influences not only the numbers and dimensions of the activation units, but also the total number of parameters. In this paper we focus on assessing the independent contributions of three of these linked variables: the numbers of layers, feature maps, and parameters. To accomplish this, we employ a recursive convolutional network whose weights are tied between layers; this allows us to vary each of the three factors in a controlled setting. We find that while increasing the numbers of layers and parameters each has a clear benefit, the number of feature maps (and hence the dimensionality of the representation) appears ancillary, and finds most of its benefit through the introduction of more weights. Our results (i) empirically confirm the notion that adding layers alone increases computational power, within the context of convolutional layers, and (ii) suggest that precise sizing of convolutional feature map dimensions is itself of little concern; more attention should be paid to the number of parameters in these layers instead.
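To make the role of weight tying concrete, here is a minimal sketch of how the parameter count of a stack of hidden convolutional layers depends on the number of layers when weights are tied versus untied. It assumes square kernels, the same number of feature maps entering and leaving every hidden layer, and one bias per output map; the function names are hypothetical and the numbers are illustrative, not taken from the paper.

```python
def conv_params(maps_in, maps_out, kernel, bias=True):
    """Parameter count of one convolutional layer with a kernel x kernel filter.

    Assumption: one filter per (input map, output map) pair, plus an
    optional bias per output map.
    """
    weights = maps_in * maps_out * kernel * kernel
    return weights + (maps_out if bias else 0)


def hidden_stack_params(layers, maps, kernel, tied):
    """Parameter count of `layers` identically shaped hidden conv layers.

    tied=True  : a single set of weights is reused at every layer, so the
                 count does not depend on `layers`.
    tied=False : every layer owns its weights, so the count grows linearly.
    """
    per_layer = conv_params(maps, maps, kernel)
    return per_layer if tied else layers * per_layer


if __name__ == "__main__":
    # Illustrative setting: 32 feature maps, 3x3 kernels.
    for layers in (1, 2, 4):
        tied = hidden_stack_params(layers, maps=32, kernel=3, tied=True)
        untied = hidden_stack_params(layers, maps=32, kernel=3, tied=False)
        print(f"layers={layers}  tied={tied}  untied={untied}")
```

Under these assumptions, tying the weights lets depth change while the parameter count stays fixed, whereas changing the number of feature maps changes the parameter count in both cases. This is the kind of decoupling the abstract relies on to study the three factors separately.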

