ML | AI | LG | Nov 13, 2017

Simple And Efficient Architecture Search for Convolutional Neural Networks

arXiv:1711.04528v1 | 244 citations
Originality: Incremental advance
AI Analysis

This paper addresses the cumbersome trial-and-error process of designing CNN architectures by hand and offers researchers and practitioners an efficient automated alternative. The advance is incremental, since it builds on existing search methods.

The paper tackles the problem of manually designing neural network architectures by proposing an automated search method based on hill climbing with network morphisms and cosine annealing, achieving competitive results such as an error rate below 6% on CIFAR-10 in 12 hours on a single GPU.
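To make the "cosine annealing" part concrete, here is a minimal sketch of a cosine-annealed learning-rate schedule in its commonly used form. The function name and the values of `lr_max`, `lr_min`, and the run length are illustrative assumptions, not settings taken from the paper.

```python
import math


def cosine_annealing_lr(step, total_steps, lr_max=0.05, lr_min=0.0):
    """Learning rate at `step` of a run lasting `total_steps` steps.

    The rate starts at lr_max and follows half a cosine wave down to lr_min.
    All default values here are assumptions chosen for illustration.
    """
    progress = step / total_steps  # 0.0 at the start, 1.0 at the end
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


# Schedule for a short 10-step run (11 points, including both endpoints).
schedule = [cosine_annealing_lr(t, 10) for t in range(11)]
```

Because the rate decays smoothly to its minimum at the end of each run, even a short training run ends with small update steps, which is what makes it plausible to compare candidate networks after only brief training.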

Neural networks have recently had a lot of success for many tasks. However, neural network architectures that perform well are still typically designed manually by experts in a cumbersome trial-and-error process. We propose a new method to automatically search for well-performing CNN architectures based on a simple hill climbing procedure whose operators apply network morphisms, followed by short optimization runs by cosine annealing. Surprisingly, this simple method yields competitive results, despite only requiring resources in the same order of magnitude as training a single network. E.g., on CIFAR-10, our method designs and trains networks with an error rate below 6% in only 12 hours on a single GPU; training for one day reduces this error further, to almost 5%.
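The hill-climbing loop described in the abstract can be sketched roughly as follows. This is a toy under loud assumptions: an architecture is just a list of layer widths, `deepen` and `widen` are hypothetical stand-ins for network morphisms (real morphisms also transfer weights so the network's function is preserved), and `evaluate` is a made-up proxy score replacing the short cosine-annealed training run. None of these names or values come from the paper.

```python
import random


def deepen(arch, rng):
    """Toy 'add a layer' operator: duplicate one layer's width in place."""
    i = rng.randrange(len(arch))
    return arch[:i + 1] + [arch[i]] + arch[i + 1:]


def widen(arch, rng):
    """Toy 'widen a layer' operator: double one layer's width."""
    i = rng.randrange(len(arch))
    return arch[:i] + [arch[i] * 2] + arch[i + 1:]


OPERATORS = [deepen, widen]  # hypothetical stand-ins for network morphisms


def evaluate(arch):
    """Made-up proxy for validation accuracy after a short training run.

    It simply rewards a total width close to 256; a real search would
    train the child network briefly and measure validation performance.
    """
    return -abs(sum(arch) - 256) / 256


def hill_climb(initial, n_steps=6, n_children=8, seed=0):
    """Repeatedly generate children of the current best and keep the best one."""
    rng = random.Random(seed)
    best = list(initial)
    for _ in range(n_steps):
        children = [rng.choice(OPERATORS)(best, rng) for _ in range(n_children)]
        # Keeping the parent as a candidate means the score never decreases;
        # this is a choice made for this sketch.
        best = max([best] + children, key=evaluate)
    return best, evaluate(best)


initial_arch = [16, 16]
final_arch, final_score = hill_climb(initial_arch)
```

The cost of such a search scales with the number of steps times the number of children times the cost of one short training run, which is consistent with the abstract's claim that the total stays within the same order of magnitude as training a single network.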

Code Implementations: 2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
