LG, CV, ML | Mar 9, 2020

How to Train Your Super-Net: An Analysis of Training Heuristics in Weight-Sharing NAS

arXiv:2003.04276v2 | 37 citations
AI Analysis

This work addresses inconsistent comparisons between weight-sharing NAS methods and provides a reproducible baseline. It is incremental in that it analyzes existing methods rather than introducing new ones.

The paper systematically evaluates training heuristics and hyperparameters in weight-sharing neural architecture search (NAS), finding that some common heuristics harm the correlation between super-net and stand-alone performance, and highlighting the strong influence of certain hyperparameters and architectural choices.

Weight sharing promises to make neural architecture search (NAS) tractable even on commodity hardware. Existing methods in this space rely on a diverse set of heuristics to design and train the shared-weight backbone network, a.k.a. the super-net. Since heuristics and hyperparameters substantially vary across different methods, a fair comparison between them can only be achieved by systematically analyzing the influence of these factors. In this paper, we therefore provide a systematic evaluation of the heuristics and hyperparameters that are frequently employed by weight-sharing NAS algorithms. Our analysis uncovers that some commonly-used heuristics for super-net training negatively impact the correlation between super-net and stand-alone performance, and evidences the strong influence of certain hyperparameters and architectural choices. Our code and experiments set a strong and reproducible baseline that future works can build on.
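The abstract's central quantity is the correlation between super-net and stand-alone performance. One common way to quantify such agreement is a rank correlation like Kendall's tau; the text above does not say which metric is used, so the choice here is an assumption. The minimal sketch below computes tau-a over a handful of architectures. The function name `kendall_tau` and all accuracy values are hypothetical and serve only as illustration.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        # Positive product: both scores order architectures i and j the same way.
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical accuracies for five architectures (illustrative values only).
# supernet_acc: evaluated with weights inherited from the shared super-net.
# standalone_acc: each architecture trained from scratch on its own.
supernet_acc = [0.62, 0.58, 0.71, 0.66, 0.60]
standalone_acc = [0.931, 0.935, 0.947, 0.940, 0.928]

tau = kendall_tau(supernet_acc, standalone_acc)
print(f"Kendall tau between super-net and stand-alone rankings: {tau:.2f}")
```

A tau near 1 would mean the super-net ranks architectures almost as stand-alone training does, while a value near 0 would mean its ranking carries little information about final performance.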
