LG AI · Sep 16, 2025

HyperNAS: Enhancing Architecture Representation for NAS Predictor via Hypernetwork

Jindi Lv, Yuhao Zhou, Yuxin Tian, Qing Ye, Wentao Feng, Jiancheng Lv

arXiv:2509.18151v1 · 11.42 citations · h-index: 9

Originality: Highly original

AI Analysis

This work addresses the time-intensive performance evaluation bottleneck in NAS for researchers and practitioners, offering a significant improvement in few-shot scenarios.

The paper tackles the problem of poor generalization in neural predictors for Neural Architecture Search (NAS) by proposing HyperNAS, a novel paradigm that enhances architecture representation learning, achieving state-of-the-art results such as 97.60% top-1 accuracy on CIFAR-10 and 82.4% on ImageNet with at least 5.0× fewer samples.

Time-intensive performance evaluations significantly impede progress in Neural Architecture Search (NAS). To address this, neural predictors leverage surrogate models trained on proxy datasets, allowing for direct performance predictions for new architectures. However, these predictors often exhibit poor generalization due to their limited ability to capture intricate relationships among various architectures. In this paper, we propose HyperNAS, a novel neural predictor paradigm for enhancing architecture representation learning. HyperNAS consists of two primary components: a global encoding scheme and a shared hypernetwork. The global encoding scheme is devised to capture comprehensive macro-structure information, while the shared hypernetwork serves as an auxiliary task to enhance the investigation of inter-architecture patterns. To ensure training stability, we further develop a dynamic adaptive multi-task loss to facilitate personalized exploration on the Pareto front. Extensive experiments across five representative search spaces, including ViTs, demonstrate the advantages of HyperNAS, particularly in few-shot scenarios. For instance, HyperNAS achieves new state-of-the-art results, with 97.60% top-1 accuracy on CIFAR-10 and 82.4% top-1 accuracy on ImageNet, using at least 5.0× fewer samples.
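
The abstract does not give the form of the dynamic adaptive multi-task loss, so the following is only an illustration of the general idea of reweighting a main task (performance prediction) against an auxiliary task during training. It uses Dynamic Weight Averaging, a generic multi-task heuristic swapped in for illustration, not the HyperNAS loss. The function names, the temperature value, and the loss numbers are all hypothetical.

```python
import math


def dwa_weights(prev_losses, prev_prev_losses, temperature=2.0):
    """Dynamic Weight Averaging (a generic heuristic, NOT the HyperNAS loss).

    Each task's weight grows with the ratio of its last two losses, so a
    task whose loss is shrinking slowly receives more weight.
    """
    # Ratio close to (or above) 1 means the task made little progress.
    ratios = [p / pp for p, pp in zip(prev_losses, prev_prev_losses)]
    exps = [math.exp(r / temperature) for r in ratios]
    total = sum(exps)
    num_tasks = len(prev_losses)
    # Softmax over ratios, scaled so the weights sum to the number of tasks.
    return [num_tasks * e / total for e in exps]


def combined_loss(task_losses, weights):
    """Weighted sum of the per-task losses."""
    return sum(w * loss for w, loss in zip(weights, task_losses))


# Hypothetical loss history, ordered as [predictor loss, auxiliary loss].
history = [[0.90, 1.20], [0.60, 1.15]]
weights = dwa_weights(history[-1], history[-2])

# Hypothetical losses for the current step, combined with the new weights.
total = combined_loss([0.55, 1.10], weights)
```

In this made-up history the auxiliary loss is falling more slowly than the predictor loss, so it receives the larger weight for the next step. A higher temperature flattens the weights toward equal weighting.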
