LG ML | Feb 21, 2020

DSNAS: Direct Neural Architecture Search without Parameter Retraining

Shoukang Hu, Sirui Xie, Hehui Zheng, Chunxiao Liu, Jianping Shi, Xunying Liu, Dahua Lin

arXiv:2002.09128v228.5146 citations | Has Code

Originality: Incremental advance

AI Analysis

This work addresses inefficiencies in NAS for computer vision tasks, offering a more direct and time-saving approach, though it is incremental as it builds on existing differentiable NAS methods.

The paper tackles a weakness of two-stage neural architecture search (NAS): the performance of an architecture during search correlates poorly with its performance after retraining. It proposes DSNAS, a differentiable framework that optimizes architecture and parameters simultaneously, so the derived child networks need no retraining. DSNAS reaches 74.4% accuracy on ImageNet in 420 GPU hours, cutting total time by more than 34% compared with two-stage methods.
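To make the reported savings concrete, the short calculation below works backwards from the two figures quoted here. It assumes that the 420 GPU hours is the full DSNAS cost and that the 34% reduction is measured against the total (search plus retraining) cost of a two-stage baseline; the variable names are illustrative, and the result is an implied lower bound, not a number reported by the authors.

```python
# Figures quoted in the summary above.
dsnas_gpu_hours = 420.0   # total cost reported for DSNAS
min_reduction = 0.34      # "more than 34%" reduction in total time

# Assumption: reduction = 1 - dsnas / baseline, so baseline = dsnas / (1 - reduction).
# Because the reduction is "more than" 34%, this is a lower bound on the baseline.
implied_baseline_hours = dsnas_gpu_hours / (1.0 - min_reduction)
implied_saving_hours = implied_baseline_hours - dsnas_gpu_hours

print(f"implied two-stage total: at least {implied_baseline_hours:.0f} GPU hours")
print(f"implied saving: at least {implied_saving_hours:.0f} GPU hours")
```

Under these assumptions, the two-stage pipeline being compared against would cost at least roughly 636 GPU hours in total.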

If NAS methods are solutions, what is the problem? Most existing NAS methods require two-stage parameter optimization. However, the performance of the same architecture in the two stages correlates poorly. Based on this observation, we propose a new problem definition for NAS: task-specific end-to-end. We argue that, given a computer vision task for which a NAS method is expected, this definition can reduce the vaguely defined NAS evaluation to i) accuracy on this task and ii) the total computation consumed to finally obtain a model with satisfactory accuracy. Seeing that most existing methods do not solve this problem directly, we propose DSNAS, an efficient differentiable NAS framework that simultaneously optimizes architecture and parameters with a low-biased Monte Carlo estimate. Child networks derived from DSNAS can be deployed directly without parameter retraining. Compared with two-stage methods, DSNAS successfully discovers networks with comparable accuracy (74.4%) on ImageNet in 420 GPU hours, reducing the total time by more than 34%. Our implementation is available at https://github.com/SNAS-Series/SNAS-Series.
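The idea of optimizing architecture and parameters in a single stage from sampled child networks can be illustrated with a toy sketch. This is not the DSNAS estimator, which the abstract does not spell out: the code below swaps in a plain score-function (REINFORCE-style) Monte Carlo gradient with a running-average baseline. The regression task, the two candidate operations, the learning rates, and all names are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task (assumed): the target is quadratic, so the quadratic op is the right choice.
x = np.linspace(-1.0, 1.0, 21)
y = 3.0 * x**2

# Two hypothetical candidate operations, each with one weight.
features = [x, x**2]      # op 0 computes w * x, op 1 computes w * x**2
weights = np.zeros(2)     # network parameters, one per candidate op
alpha = np.zeros(2)       # architecture logits


def loss_and_grad(k, w):
    """Mean squared error of op k with weight w, and its gradient w.r.t. w."""
    residual = w * features[k] - y
    return np.mean(residual**2), 2.0 * np.mean(residual * features[k])


baseline, _ = loss_and_grad(0, weights[0])  # running-average loss baseline
for step in range(300):
    probs = np.exp(alpha - alpha.max())
    probs /= probs.sum()
    k = rng.choice(2, p=probs)              # sample one child network (one op)
    loss, grad_w = loss_and_grad(k, weights[k])

    # Architecture step: score-function estimate of d E[loss] / d alpha.
    one_hot = np.eye(2)[k]
    alpha -= 0.5 * (loss - baseline) * (one_hot - probs)
    baseline = 0.9 * baseline + 0.1 * loss

    # Parameter step: gradient descent on the weight of the sampled op only.
    weights[k] -= 0.5 * grad_w

best_op = int(np.argmax(alpha))
print("selected op:", best_op, "weight:", weights[best_op])
```

Because the weights are trained in the same loop that selects the operation, the selected op can be used as-is at the end, with no separate retraining pass. That property is what the abstract claims for DSNAS child networks; the sketch only mirrors it on a two-op toy problem.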
