LG, ML · Feb 10, 2023

DNArch: Learning Convolutional Neural Architectures by Backpropagation

arXiv:2302.05400v2 · 6.64 citations · h-index: 28

Originality: Highly original

AI Analysis

This work targets the manual design of architectures in deep learning. It offers a differentiable approach that can discover entire CNN architectures under a computational budget, though it builds incrementally on prior neural architecture search methods.

The authors tackle the automatic design of convolutional neural network architectures by introducing DNArch, a method that jointly learns network weights and architecture parameters via backpropagation. It achieves competitive performance on classification and dense prediction tasks across sequential and image data.

We present Differentiable Neural Architectures (DNArch), a method that jointly learns the weights and the architecture of Convolutional Neural Networks (CNNs) by backpropagation. In particular, DNArch allows learning (i) the size of convolutional kernels at each layer, (ii) the number of channels at each layer, (iii) the position and values of downsampling layers, and (iv) the depth of the network. To this end, DNArch views neural architectures as continuous multidimensional entities, and uses learnable differentiable masks along each dimension to control their size. Unlike existing methods, DNArch is not limited to a predefined set of possible neural components, but instead it is able to discover entire CNN architectures across all feasible combinations of kernel sizes, widths, depths and downsampling. Empirically, DNArch finds performant CNN architectures for several classification and dense prediction tasks on sequential and image data. When combined with a loss term that controls the network complexity, DNArch constrains its search to architectures that respect a predefined computational budget during training.
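The idea of a learnable differentiable mask that controls the size of one architectural dimension, together with a loss term that penalizes exceeding a budget, can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the Gaussian form of the mask, the cut-off `threshold`, the function names, and the squared-excess shape of `budget_penalty`. It uses plain Python so the arithmetic is easy to follow; a real implementation would sit inside an autodiff framework so that `sigma` receives gradients.

```python
import math


def gaussian_mask(num_positions, sigma):
    """Mask over kernel positions mapped to [-1, 1], centred at 0.

    `sigma` plays the role of a learnable width (assumed Gaussian form):
    a smaller sigma suppresses the outer positions, shrinking the kernel.
    """
    if num_positions == 1:
        return [1.0]
    coords = [-1.0 + 2.0 * i / (num_positions - 1) for i in range(num_positions)]
    return [math.exp(-0.5 * (x / sigma) ** 2) for x in coords]


def masked_kernel(weights, sigma):
    """Element-wise product of the kernel weights with the mask."""
    mask = gaussian_mask(len(weights), sigma)
    return [w * m for w, m in zip(weights, mask)]


def effective_size(mask, threshold=0.1):
    """Positions whose mask value exceeds a (hypothetical) cut-off.

    Positions below the cut-off could be cropped without changing the
    output much, which is where the compute saving would come from.
    """
    return sum(1 for m in mask if m > threshold)


def soft_size(mask):
    """Smooth surrogate for the size: the sum of the mask values."""
    return sum(mask)


def budget_penalty(soft_sizes, budget):
    """Hypothetical complexity term: squared relative excess over the budget.

    Zero while the summed soft sizes stay within the budget, and growing
    smoothly once they exceed it.
    """
    total = sum(soft_sizes)
    excess = max(0.0, total / budget - 1.0)
    return excess ** 2


if __name__ == "__main__":
    for sigma in (0.5, 0.3, 0.2):
        mask = gaussian_mask(5, sigma)
        print(sigma, [round(m, 3) for m in mask], effective_size(mask))
```

Because the mask is a smooth function of `sigma`, a task loss computed through `masked_kernel` plus a term such as `budget_penalty` is differentiable with respect to the width. That is the sense in which an architectural dimension can be trained alongside the weights. The same pattern could be applied along channels, depth, or resolution with a suitably shaped mask.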
