CV · Apr 10, 2020

ModuleNet: Knowledge-inherited Neural Architecture Search

Yaran Chen, Ruiyuan Gao, Fenggang Liu, Dongbin Zhao

arXiv:2004.05020v2 · 10.136 citations

Originality: Incremental advance

AI Analysis

This work addresses the computational inefficiency and knowledge reuse challenges in NAS for deep learning researchers, though it is incremental as it builds on existing NAS and model decomposition methods.

The paper addresses two shortcomings of Neural Architecture Search (NAS): it neglects the knowledge held in existing models, and it is computationally expensive. It proposes ModuleNet, a NAS algorithm that inherits knowledge from existing convolutional neural networks by decomposing them into modules that keep their weights and then searching over those modules. The reported results on CIFAR10 and CIFAR100 are better than those of the original architectures.

Although Neural Architecture Search (NAS) can improve deep models, it typically neglects the valuable knowledge held in existing models. Because NAS is costly in both computation and time, the search should not start from scratch but should reuse existing knowledge wherever possible. In this paper, we discuss what kind of knowledge in a model can and should be used for new architecture design. We then propose a new NAS algorithm, ModuleNet, which can fully inherit knowledge from existing convolutional neural networks. To make full use of existing models, we decompose them into different modules that keep their weights; together these modules form a knowledge base. We then sample and search for new architectures from this knowledge base. Unlike previous search algorithms, and thanks to the inherited knowledge, our method can search for architectures directly in the macro space with the NSGA-II algorithm, without tuning the parameters in these modules. Experiments show that our strategy can efficiently evaluate the performance of a new architecture even without tuning the weights in convolutional layers. With the help of the inherited knowledge, our search results consistently outperform the original architectures on CIFAR10 and CIFAR100.
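
The search loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the module names, the cost values, the two objectives (error and cost), and the `toy_objectives` scoring function are all hypothetical placeholders. The sketch only shows the general shape of the idea, namely a knowledge base of reusable modules, architectures sampled as sequences of those modules, and the non-dominated (Pareto) selection step that NSGA-II is built around.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """Hypothetical knowledge-base entry: a block cut from a trained CNN."""
    name: str
    source_model: str
    cost: float  # assumed compute proxy, not a value from the paper


# Hypothetical knowledge base. In the paper's setting each module would
# also carry its inherited weights; they are omitted in this sketch.
KNOWLEDGE_BASE = [
    Module("block_a", "model_1", 1.0),
    Module("block_b", "model_2", 1.6),
    Module("block_c", "model_3", 2.2),
]


def sample_architecture(rng, depth=4):
    """Sample a macro architecture as a sequence of knowledge-base modules."""
    return tuple(rng.choice(KNOWLEDGE_BASE) for _ in range(depth))


def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimised)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(scores):
    """Indices of non-dominated points, i.e. the first front produced by
    NSGA-II's non-dominated sorting."""
    return [
        i for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


def toy_objectives(arch, rng):
    """Placeholder for evaluating an architecture with its inherited
    weights kept fixed. Returns (error, cost); both are made-up numbers."""
    cost = sum(m.cost for m in arch)
    error = 1.0 / (1.0 + cost) + rng.uniform(0.0, 0.05)
    return (error, cost)


rng = random.Random(0)
population = [sample_architecture(rng) for _ in range(8)]
scores = [toy_objectives(arch, rng) for arch in population]
front = pareto_front(scores)  # candidates that survive one selection step
```

A full NSGA-II run would additionally rank the remaining fronts, apply crowding-distance sorting, and generate offspring by crossover and mutation over the module sequences; those steps are left out here to keep the sketch short.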
