Neural Architecture Search without Training
The method reduces the time and expense of NAS for researchers and practitioners, though the contribution is incremental because it builds on existing NAS methods.
The paper tackles the high computational cost of Neural Architecture Search (NAS) by predicting a network's trained accuracy from its untrained state, using the overlap of activations between datapoints. This enables architecture search in seconds on a single GPU without any training. The method is validated on NAS-Bench-101, NAS-Bench-201, NATS-Bench, and Network Design Spaces.
The time and effort involved in hand-designing deep neural networks is immense. This has prompted the development of Neural Architecture Search (NAS) techniques to automate this design. However, NAS algorithms tend to be slow and expensive; they need to train vast numbers of candidate networks to inform the search process. This could be alleviated if we could partially predict a network's trained accuracy from its initial state. In this work, we examine the overlap of activations between datapoints in untrained networks and motivate how this can give a measure which is usefully indicative of a network's trained performance. We incorporate this measure into a simple algorithm that allows us to search for powerful networks without any training in a matter of seconds on a single GPU, and verify its effectiveness on NAS-Bench-101, NAS-Bench-201, NATS-Bench, and Network Design Spaces. Our approach can be readily combined with more expensive search methods; we examine a simple adaptation of regularised evolutionary search. Code for reproducing our experiments is available at https://github.com/BayesWatch/nas-without-training.
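The idea of scoring an untrained network by how its activations overlap across datapoints can be illustrated with a small sketch. It assumes the score is the log-determinant of a kernel that counts, for each pair of inputs, the ReLU units on which they share the same on/off state. That formula, the toy fully connected network, and every function name below are illustrative assumptions, not the authors' released code; see the repository linked above for the actual implementation.

```python
import math
import random


def random_mlp(layer_sizes, rng):
    """Untrained ReLU MLP (hypothetical toy model): one Gaussian weight matrix per layer."""
    return [
        [[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
         for _ in range(n_out)]
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]


def activation_code(weights, x):
    """Binary code for input x: 1 where a ReLU unit is active, 0 otherwise."""
    code, h = [], x
    for layer in weights:
        pre = [sum(w * v for w, v in zip(row, h)) for row in layer]
        code.extend(1 if p > 0 else 0 for p in pre)
        h = [max(p, 0.0) for p in pre]
    return code


def overlap_kernel(codes):
    """K[i][j] = number of units on which inputs i and j agree (units minus Hamming distance)."""
    n_units = len(codes[0])
    return [[n_units - sum(a != b for a, b in zip(ci, cj)) for cj in codes]
            for ci in codes]


def log_abs_det(matrix):
    """log|det(matrix)| via Gaussian elimination with partial pivoting."""
    a = [[float(v) for v in row] for row in matrix]
    n, total = len(a), 0.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return float("-inf")  # singular kernel: some inputs are indistinguishable
        a[col], a[pivot] = a[pivot], a[col]
        total += math.log(abs(a[col][col]))
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
    return total


def score_untrained(layer_sizes, batch, seed=0):
    """Score one candidate architecture at initialisation; higher means less overlap."""
    weights = random_mlp(layer_sizes, random.Random(seed))
    codes = [activation_code(weights, x) for x in batch]
    return log_abs_det(overlap_kernel(codes))


def search(candidates, batch, seed=0):
    """Training-free search: return the candidate with the highest score."""
    return max(candidates, key=lambda sizes: score_untrained(sizes, batch, seed))


if __name__ == "__main__":
    rng = random.Random(1)
    batch = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(8)]
    candidates = [[4, 8, 8], [4, 16, 16], [4, 32, 32]]
    print(search(candidates, batch))
```

Under these assumptions, two inputs with identical activation patterns produce identical kernel rows, so the determinant collapses and the score drops to negative infinity. Architectures whose initial state separates the inputs into distinct patterns score higher, and no gradient step is ever taken.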