LG, CV, ML | Oct 12, 2018

Graph HyperNetworks for Neural Architecture Search

arXiv:1810.05749v3 | 303 citations
Originality: Incremental advance
AI Analysis

This work addresses the efficiency problem in NAS for machine learning practitioners, offering a faster and more cost-effective method, though it is incremental as it builds on hypernetworks and graph neural networks.

The paper tackles the high computational cost of neural architecture search (NAS) by proposing Graph HyperNetworks (GHNs), which generate the weights for a given architecture by running inference on a graph neural network, so candidate networks need not be trained individually during search. GHNs searched nearly 10 times faster than other random search methods on CIFAR-10 and ImageNet, and in the anytime prediction setting they found networks with better speed-accuracy tradeoffs than state-of-the-art manual designs.

Neural architecture search (NAS) automatically finds the best task-specific neural network topology, outperforming many manual architecture designs. However, it can be prohibitively expensive as the search requires training thousands of different networks, while each can last for hours. In this work, we propose the Graph HyperNetwork (GHN) to amortize the search cost: given an architecture, it directly generates the weights by running inference on a graph neural network. GHNs model the topology of an architecture and therefore can predict network performance more accurately than regular hypernetworks and premature early stopping. To perform NAS, we randomly sample architectures and use the validation accuracy of networks with GHN generated weights as the surrogate search signal. GHNs are fast -- they can search nearly 10 times faster than other random search methods on CIFAR-10 and ImageNet. GHNs can be further extended to the anytime prediction setting, where they have found networks with better speed-accuracy tradeoff than the state-of-the-art manual designs.
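The search procedure described in the abstract (sample random architectures, score each with a surrogate signal, keep the best) can be sketched as below. This is a minimal illustration and not the authors' implementation: the operation set `OPS`, the architecture encoding, and `surrogate_score` are all hypothetical. In particular, `surrogate_score` is a fixed toy heuristic standing in for "validation accuracy of the network with GHN-generated weights"; no hypernetwork is run here.

```python
import random

# Hypothetical operation set; the paper's actual search space is not given here.
OPS = ["conv3x3", "conv5x5", "maxpool", "identity"]


def sample_architecture(rng, num_nodes=4):
    """Sample a random chain-like DAG.

    Node i is encoded as (op, input_index), where input_index 0 is the
    network input and input_index j >= 1 is the output of node j - 1.
    """
    return [(rng.choice(OPS), rng.randrange(i + 1)) for i in range(num_nodes)]


def surrogate_score(arch):
    """Toy stand-in for the surrogate search signal.

    Assumption: a real system would generate weights for `arch` with a
    hypernetwork and return validation accuracy. Here each op simply
    contributes a fixed, made-up value.
    """
    value = {"conv3x3": 0.3, "conv5x5": 0.25, "maxpool": 0.1, "identity": 0.05}
    return sum(value[op] for op, _ in arch) / len(arch)


def random_search(num_samples, seed=0):
    """Sample `num_samples` architectures and return the best-scoring one."""
    rng = random.Random(seed)
    candidates = [sample_architecture(rng) for _ in range(num_samples)]
    return max(candidates, key=surrogate_score)


if __name__ == "__main__":
    best = random_search(num_samples=50)
    print(best, surrogate_score(best))
```

The point of the sketch is the cost structure: each candidate is scored by one cheap call rather than by a full training run, which is where the amortization described in the abstract would come from.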

Code Implementations: 2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
