LG, AI, CV, ML | Jul 9, 2020

Accuracy Prediction with Non-neural Model for Neural Architecture Search

arXiv:2007.04785v3 | 17 citations | Has Code
AI Analysis

This work addresses the challenge of designing efficient and effective accuracy predictors for neural architecture search (NAS). The contribution is incremental: it applies a non-neural model to a known bottleneck in NAS.

The paper replaces the neural network accuracy predictor commonly used in NAS with a gradient boosting decision tree (GBDT). The GBDT predictor achieves comparable or better prediction accuracy and improves sample efficiency: on NASBench-101 it is 22x more sample efficient than random search in finding the global optimum. On ImageNet it reaches a 24.2% top-1 error rate, and 23.4% when combined with search space pruning.

Neural architecture search (NAS) with an accuracy predictor that predicts the accuracy of candidate architectures has drawn increasing attention due to its simplicity and effectiveness. Previous works usually employ neural network-based predictors, which require more delicate design and are prone to overfitting. Considering that most architectures are represented as sequences of discrete symbols, which resemble tabular data and are better suited to non-neural predictors, in this paper we study an alternative approach that uses a non-neural model for accuracy prediction. Specifically, as decision tree-based models can better handle tabular data, we leverage a gradient boosting decision tree (GBDT) as the predictor for NAS. We demonstrate that the GBDT predictor can achieve prediction accuracy comparable to (if not better than) that of neural network-based predictors. Moreover, considering that a compact search space can ease the search process, we propose to prune the search space gradually according to important features derived from GBDT. In this way, NAS can be performed by first pruning the search space and then searching for a neural architecture, which is more efficient and effective. Experiments on NASBench-101 and ImageNet demonstrate the effectiveness of using GBDT as the predictor for NAS: (1) On NASBench-101, it is 22x, 8x, and 6x more sample efficient than random search, regularized evolution, and Monte Carlo Tree Search (MCTS), respectively, in finding the global optimum; (2) It achieves a 24.2% top-1 error rate on ImageNet, and further achieves a 23.4% top-1 error rate on ImageNet when enhanced with search space pruning. Code is provided at https://github.com/renqianluo/GBDT-NAS.
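The workflow the abstract describes (encode architectures as tabular features, fit a GBDT to predict accuracy, prune the search space using feature information from the GBDT, then search) can be illustrated with a small sketch. This is not the authors' implementation. Everything here is an assumption made for illustration: the toy search space (`OPS`, `NUM_LAYERS`), the synthetic `toy_accuracy` function standing in for real training results, the hand-written boosting of depth-1 trees in place of a production GBDT library, and the signed per-feature effect used as a simple stand-in for the paper's feature-importance criterion.

```python
import itertools
import random

OPS = ["conv3x3", "conv5x5", "maxpool"]  # hypothetical operation set
NUM_LAYERS = 4                            # hypothetical number of layers

def encode(arch):
    """One-hot encode an architecture (one op index per layer) as a flat 0/1 row."""
    vec = [0] * (NUM_LAYERS * len(OPS))
    for layer, op in enumerate(arch):
        vec[layer * len(OPS) + op] = 1
    return vec

def toy_accuracy(arch, rng):
    """Synthetic stand-in for training an architecture; the numbers are made up."""
    acc = 0.90 + 0.03 * (arch[0] == 0) - 0.05 * (arch[2] == 2)
    return acc + rng.gauss(0, 0.002)

def fit_gbdt(X, y, rounds=60, lr=0.3):
    """Gradient boosting with depth-1 regression trees under squared loss."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []  # (feature index, output if feature is 1, output if feature is 0)
    for _ in range(rounds):
        # For squared loss the negative gradient is the residual.
        resid = [t - p for t, p in zip(y, pred)]
        best = None
        for j in range(len(X[0])):
            on = [r for x, r in zip(X, resid) if x[j]]
            off = [r for x, r in zip(X, resid) if not x[j]]
            if not on or not off:
                continue
            m_on, m_off = sum(on) / len(on), sum(off) / len(off)
            # Squared-error reduction from fitting this stump to the residuals.
            gain = len(on) * m_on ** 2 + len(off) * m_off ** 2
            if best is None or gain > best[0]:
                best = (gain, j, lr * m_on, lr * m_off)
        _, j, v_on, v_off = best
        stumps.append((j, v_on, v_off))
        pred = [p + (v_on if x[j] else v_off) for x, p in zip(X, pred)]
    return base, stumps

def predict(model, x):
    base, stumps = model
    return base + sum(v_on if x[j] else v_off for j, v_on, v_off in stumps)

def feature_effects(model, n_features):
    """Signed effect of switching each one-hot feature on, summed over all stumps."""
    effects = [0.0] * n_features
    for j, v_on, v_off in model[1]:
        effects[j] += v_on - v_off
    return effects

# 1) Evaluate a small random sample of architectures and fit the predictor.
rng = random.Random(0)
train = [[rng.randrange(len(OPS)) for _ in range(NUM_LAYERS)] for _ in range(60)]
X = [encode(a) for a in train]
y = [toy_accuracy(a, rng) for a in train]
model = fit_gbdt(X, y)

# 2) Prune the (layer, op) choice with the most negative learned effect.
effects = feature_effects(model, len(X[0]))
worst = min(range(len(effects)), key=effects.__getitem__)
pruned = divmod(worst, len(OPS))

# 3) Search the pruned space by ranking candidates with the predictor.
candidates = [a for a in itertools.product(range(len(OPS)), repeat=NUM_LAYERS)
              if a[pruned[0]] != pruned[1]]
best_arch = max(candidates, key=lambda a: predict(model, encode(a)))
print("pruned (layer, op):", pruned, "best:", [OPS[o] for o in best_arch])
```

In practice one would likely swap the hand-written booster for an established GBDT library and replace `toy_accuracy` with measured validation accuracy; the sketch only shows how the predictor, the pruning step, and the search fit together.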
