LG · ML · Feb 21, 2020

DSNAS: Direct Neural Architecture Search without Parameter Retraining

arXiv:2002.09128v2 · 146 citations · Has Code
AI Analysis

This work targets the cost of two-stage NAS for computer vision tasks by searching and training in a single stage, which saves time. The contribution is incremental in that it builds on existing differentiable NAS methods.

The paper addresses the poor correlation in neural architecture search (NAS) between an architecture's performance during search and its performance after retraining. It proposes DSNAS, a differentiable framework that optimizes architecture and parameters simultaneously, so the derived network needs no retraining. DSNAS reaches 74.4% accuracy on ImageNet in 420 GPU hours, reducing total time by more than 34% compared with two-stage methods.
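The two figures in this summary can be related with a little arithmetic. The sketch below assumes that the 34% reduction is measured against the two-stage total (search plus retraining) and that 420 GPU hours is the full DSNAS cost; the function name is hypothetical and the result is only an implied lower bound, not a number reported in the source.

```python
def implied_baseline_hours(total_hours: float, reduction: float) -> float:
    """Return the baseline cost implied by a relative time reduction.

    If new = baseline * (1 - reduction), then baseline = new / (1 - reduction).
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return total_hours / (1.0 - reduction)


# Assumption: 420 GPU hours is the DSNAS total and 34% is relative to the baseline.
baseline = implied_baseline_hours(420.0, 0.34)
print(f"implied two-stage total: more than about {baseline:.0f} GPU hours")
```

Under these assumptions, a reduction of more than 34% means the compared two-stage pipelines cost more than roughly 636 GPU hours in total.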

If NAS methods are solutions, what is the problem? Most existing NAS methods require two-stage parameter optimization. However, performance of the same architecture in the two stages correlates poorly. In this work, we propose a new problem definition for NAS, task-specific end-to-end, based on this observation. We argue that given a computer vision task for which a NAS method is expected, this definition can reduce the vaguely-defined NAS evaluation to i) accuracy of this task and ii) the total computation consumed to finally obtain a model with satisfactory accuracy. Seeing that most existing methods do not solve this problem directly, we propose DSNAS, an efficient differentiable NAS framework that simultaneously optimizes architecture and parameters with a low-biased Monte Carlo estimate. Child networks derived from DSNAS can be deployed directly without parameter retraining. Compared with two-stage methods, DSNAS successfully discovers networks with comparable accuracy (74.4%) on ImageNet in 420 GPU hours, reducing the total time by more than 34%. Our implementation is available at https://github.com/SNAS-Series/SNAS-Series.
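To make the idea of a Monte Carlo gradient estimate over architecture choices concrete, here is a minimal toy sketch. It is not the DSNAS estimator: it uses the generic score-function (REINFORCE-style) estimator on a single categorical choice among candidate operations, with fixed per-operation losses standing in for a trained child network. All names (`softmax`, `exact_gradient`, `monte_carlo_gradient`) and the numeric values are assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def exact_gradient(logits, losses):
    """Gradient of the expected loss sum_k p_k * loss_k w.r.t. the logits."""
    probs = softmax(logits)
    expected = sum(p * l for p, l in zip(probs, losses))
    return [p * (l - expected) for p, l in zip(probs, losses)]


def monte_carlo_gradient(logits, losses, num_samples, rng):
    """Score-function estimate: average of loss_k * d log p_k / d logits."""
    probs = softmax(logits)
    indices = range(len(logits))
    grad = [0.0] * len(logits)
    for _ in range(num_samples):
        # Sample one candidate operation (a single path) from the distribution.
        k = rng.choices(indices, weights=probs)[0]
        for j in indices:
            indicator = 1.0 if j == k else 0.0
            # d log p_k / d logit_j = 1[j == k] - p_j for a softmax.
            grad[j] += losses[k] * (indicator - probs[j])
    return [g / num_samples for g in grad]


# Toy values (assumed): three candidate operations with fixed losses.
logits = [0.0, 0.5, -0.5]
losses = [1.0, 0.4, 2.0]
rng = random.Random(0)

exact = exact_gradient(logits, losses)
estimate = monte_carlo_gradient(logits, losses, num_samples=20000, rng=rng)
print("exact   :", [round(g, 3) for g in exact])
print("estimate:", [round(g, 3) for g in estimate])
```

With enough samples the estimate approaches the exact gradient. In a real single-stage search the fixed losses would be replaced by the loss of the sampled child network, whose parameters are updated in the same loop; the sketch leaves that part out.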

Code Implementations: 1 repo
