CV · Aug 18, 2025

Dextr: Zero-Shot Neural Architecture Search with Singular Value Decomposition and Extrinsic Curvature

Rohan Asthana, Joschua Conrad, Maurits Ortmanns, Vasileios Belagiannis

arXiv:2508.12977v1 · 3.6 · h-index: 3 · Has Code · Trans. Mach. Learn. Res.

Originality: Highly original

AI Analysis

Existing zero-shot NAS proxies often require labeled data, which is frequently unavailable in real-world settings. This work removes that requirement, giving researchers and practitioners in machine learning a more practical and efficient way to rank architectures.

The paper tackles zero-shot neural architecture search (NAS) by proposing a new zero-cost proxy that needs no labeled data: it combines singular value decomposition of layer features with the extrinsic curvature of the network output to predict network performance from a single label-free sample. The method achieves superior performance on multiple correlation benchmarks, including NAS-Bench-101, NAS-Bench-201, and TransNAS-Bench-101-micro, while remaining efficient to compute.

Zero-shot Neural Architecture Search (NAS) typically optimises the architecture search process by exploiting the network or gradient properties at initialisation through zero-cost proxies. Existing proxies often rely on labelled data, which is usually unavailable in real-world settings. Furthermore, the majority of current methods focus either on optimising the convergence and generalisation attributes or solely on the expressivity of the network architectures. To address both limitations, we first demonstrate how channel collinearity affects the convergence and generalisation properties of a neural network. Then, by incorporating convergence, generalisation and expressivity in one approach, we propose a zero-cost proxy that omits the requirement of labelled data for its computation. In particular, we leverage the Singular Value Decomposition (SVD) of the neural network layer features and the extrinsic curvature of the network output to design our proxy. As a result, the proposed proxy is formulated as the simplified harmonic mean of the logarithms of two key components: the sum of the inverse of the feature condition number and the extrinsic curvature of the network output. Our approach enables accurate prediction of network performance on test data using only a single label-free data sample. Our extensive evaluation comprises six experiments, covering both a Convolutional Neural Network (CNN) search space (DARTS) and a Transformer search space (AutoFormer). The proposed proxy demonstrates superior performance on multiple correlation benchmarks, including NAS-Bench-101, NAS-Bench-201, and TransNAS-Bench-101-micro, as well as on the NAS task within the DARTS and AutoFormer search spaces, all while being notably efficient. The code is available at https://github.com/rohanasthana/Dextr.
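
The formulation described in the abstract (a harmonic-style mean of the logarithms of a summed inverse condition number and an output curvature) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; see the linked repository for that. Several details here are assumptions made only for illustration: the function names are hypothetical, each layer's features are assumed to be already reshaped into a 2-D matrix, `log1p` is used so both logarithms stay non-negative, "simplified" is read as dropping the constant factor 2 from the harmonic mean, and the extrinsic curvature is taken as a precomputed scalar rather than derived from the network output.

```python
import numpy as np


def inverse_condition_number(features: np.ndarray) -> float:
    """Return sigma_min / sigma_max of a 2-D feature matrix.

    Values near 0 indicate nearly collinear rows or columns;
    a value of 1 means the matrix is perfectly conditioned.
    """
    # compute_uv=False returns only the singular values, sorted descending.
    singular_values = np.linalg.svd(features, compute_uv=False)
    largest = singular_values[0]
    if largest == 0.0:
        return 0.0  # all-zero features carry no usable signal
    return float(singular_values[-1] / largest)


def harmonic_style_combine(a: float, b: float) -> float:
    """Harmonic mean of a and b without the factor 2 (assumed simplification)."""
    total = a + b
    return 0.0 if total == 0.0 else float(a * b / total)


def proxy_sketch(layer_features, curvature: float) -> float:
    """Combine per-layer conditioning with a given output curvature.

    layer_features: iterable of 2-D arrays, one per layer (hypothetical layout).
    curvature: non-negative scalar, assumed to be computed elsewhere.
    """
    collinearity_term = sum(inverse_condition_number(f) for f in layer_features)
    # log1p is an assumption here; it keeps both terms >= 0.
    return harmonic_style_combine(np.log1p(collinearity_term), np.log1p(curvature))
```

Under these assumptions, a higher score needs both well-conditioned layer features and a large output curvature, because a harmonic-style mean is dominated by the smaller of its two terms. The features would come from a single forward pass of one unlabelled input through the network at initialisation.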
