LG CV | Mar 16, 2022

Learning Where To Look -- Generative NAS is Surprisingly Efficient

Jovita Lukasik, Steffen Jung, Margret Keuper

arXiv:2203.08734v214.121 citations | h-index: 17 | Has Code

Originality: Highly original

AI Analysis

This work addresses the challenge of reducing costly evaluations in automated neural architecture search, which is crucial for researchers and practitioners in machine learning seeking efficient model design.

The paper tackles efficient neural architecture search (NAS) by proposing a generative model paired with a surrogate predictor that iteratively generates architectures from promising latent subspaces. It reports state-of-the-art performance on ImageNet and outperforms existing methods on NAS benchmarks for single and multiple objectives.

The efficient, automated search for well-performing neural architectures (NAS) has drawn increasing attention in recent years. The predominant research objective is to reduce the need for costly evaluations of neural architectures while efficiently exploring large search spaces. To this end, surrogate models embed architectures in a latent space and predict their performance, while generative models for neural architectures enable optimization-based search within the latent space the generator draws from. Both surrogate and generative models aim to facilitate query-efficient search in a well-structured latent space. In this paper, we further improve the trade-off between query efficiency and promising architecture generation by leveraging the advantages of both efficient surrogate models and generative design. We propose a generative model, paired with a surrogate predictor, that iteratively learns to generate samples from increasingly promising latent subspaces. This approach leads to very effective and efficient architecture search while keeping the number of queries low. In addition, our approach makes it straightforward to jointly optimize for multiple objectives such as accuracy and hardware latency. We show the benefit of this approach not only for optimizing architectures for the highest classification accuracy but also in the context of hardware constraints, and we outperform state-of-the-art methods on several NAS benchmarks for single and multiple objectives. We also achieve state-of-the-art performance on ImageNet. The code is available at http://github.com/jovitalukasik/AG-Net .
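
The iterative idea in the abstract (a generator proposes candidates, a surrogate ranks them, only a few are actually evaluated, and the generator is refit toward the best results) can be illustrated with a toy sketch. This is not the AG-Net implementation: architectures are replaced by 3-dimensional vectors, the generator by a per-dimension Gaussian, the surrogate by a k-nearest-neighbour average, and `true_score`, `knn_predict`, and `search` are hypothetical names chosen for this illustration only.

```python
import random
import statistics


def true_score(arch):
    """Stand-in for an expensive evaluation (assumption: a toy quadratic)."""
    target = [0.7, -0.3, 0.2]
    return -sum((a - t) ** 2 for a, t in zip(arch, target))


def knn_predict(arch, evaluated, k=3):
    """Toy surrogate: mean score of the k nearest already-evaluated points."""
    nearest = sorted(
        evaluated,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(arch, item[0])),
    )[:k]
    return statistics.fmean(score for _, score in nearest)


def search(rounds=10, pool=200, queries_per_round=8, seed=0):
    rng = random.Random(seed)
    dim = 3
    mean, std = [0.0] * dim, [1.0] * dim  # toy "generator" parameters

    def sample():
        return [rng.gauss(m, s) for m, s in zip(mean, std)]

    # Seed the surrogate with a few real evaluations.
    evaluated = []
    for _ in range(queries_per_round):
        arch = sample()
        evaluated.append((arch, true_score(arch)))

    for _ in range(rounds):
        # 1. Generate many cheap candidates from the current generator.
        candidates = [sample() for _ in range(pool)]
        # 2. Rank them with the surrogate instead of evaluating all of them.
        candidates.sort(key=lambda c: knn_predict(c, evaluated), reverse=True)
        # 3. Spend real queries only on the most promising candidates.
        for arch in candidates[:queries_per_round]:
            evaluated.append((arch, true_score(arch)))
        # 4. Refit the generator on the best evaluated points so far.
        elite = sorted(evaluated, key=lambda item: item[1], reverse=True)
        elite = elite[:queries_per_round]
        for d in range(dim):
            values = [arch[d] for arch, _ in elite]
            mean[d] = statistics.fmean(values)
            std[d] = max(statistics.pstdev(values), 0.05)  # keep some spread

    best_arch, best_score = max(evaluated, key=lambda item: item[1])
    return best_arch, best_score, len(evaluated)


if __name__ == "__main__":
    arch, score, n_queries = search()
    print(f"best score {score:.4f} after {n_queries} queries")
```

With the defaults, the sketch spends 8 initial queries plus 8 per round, so the query budget is fixed in advance regardless of how many candidates the generator proposes. A second objective such as latency could, under the same assumptions, be folded into `true_score` as a weighted penalty; how the paper itself combines objectives is not specified in the abstract.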

