LG · NE · May 15

Towards Code-Oriented LM Embeddings for Surrogate-Assisted Neural Architecture Search

Pranav Somu, Advay Balakrishnan, Stepan Kravtsov, Aaron McDaniel, Jason Zutty

arXiv:2605.15649

Predicted impact: top 67% in LG (last 90 days) · Originality: incremental advance

AI Analysis

For researchers in neural architecture search, this work provides a cheap and effective embedding strategy that reduces the computational cost of surrogate modeling.

The paper introduces Code-Oriented LM Embeddings (COLE), a low-cost method that uses off-the-shelf language models to extract features from PyTorch code representations of neural architectures, removing the need for expensive fine-tuning in surrogate-assisted NAS. In BANANAS search on NAS-Bench-201, replacing structural path encodings with COLE cuts the evaluation budget needed to reach within 1% of the best architecture's CIFAR-100 test accuracy by 34%.
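The budget figure quoted above can be made concrete with a small sketch. It assumes "within 1%" means one absolute accuracy point, which the summary does not specify, and the function names and all numbers in the usage lines are hypothetical illustrations, not results from the paper.

```python
def evals_to_within(accs, best_acc, tol=1.0):
    """Count evaluations until one architecture is within `tol` points of `best_acc`.

    `accs` lists test accuracies (in percent) in the order the search
    evaluated them. Returns None if the threshold is never reached.
    ASSUMPTION: `tol` is an absolute gap in accuracy points.
    """
    for k, acc in enumerate(accs, start=1):
        if acc >= best_acc - tol:
            return k
    return None


def relative_reduction(baseline_budget, new_budget):
    """Fractional saving in evaluation budget relative to the baseline encoding."""
    return (baseline_budget - new_budget) / baseline_budget


# Made-up search trajectory and budgets, for illustration only.
trajectory = [60.0, 68.0, 72.5, 73.2]
budget = evals_to_within(trajectory, best_acc=73.5)
saving = relative_reduction(baseline_budget=100, new_budget=66)
```

Comparing two encodings then amounts to running the same search with each, computing `evals_to_within` for both trajectories, and passing the two budgets to `relative_reduction`.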

Developing effective surrogates (performance predictors) for Neural Architecture Search (NAS) typically requires expensive fine-tuning or the engineering of complex representations. We propose a low-cost embedding strategy that leverages the inductive bias of Language Models (LMs) to eliminate these overheads. By representing architectures as PyTorch class definition text, we demonstrate that off-the-shelf LMs act as competitive feature extractors without NAS-specialized fine-tuning. The final predictor is constructed by passing the extracted Code-Oriented LM Embeddings (COLE) through a lightweight regression head. We also investigate strategies to improve embedding quality and utilization. Our experiments on the NAS-Bench-201 and einspace search spaces reveal that raw code inputs yield higher predictive performance than other text-based encodings (e.g., ONNX-to-text encodings) when using frozen LMs. We also observe that COLE drives superior surrogate-assisted search with the BANANAS algorithm on NAS-Bench-201. When optimizing for CIFAR-100 performance, replacing structural path encodings with COLE for architecture representation allows for a 34% decrease in the evaluation budget required to reach within 1% of the fittest architecture in the search space (by test accuracy). As any neural architecture can be represented as code, these findings establish COLE as a versatile and efficient foundation for advancing NAS.
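A minimal sketch of the pipeline shape described in the abstract: architecture as PyTorch class definition text, a frozen feature extractor, then a lightweight regression head. To stay runnable offline, `embed_code` is a stand-in (hashed character n-grams) for the frozen LM, not the paper's embedding; the linear head, the example class text, and the target scores are likewise hypothetical placeholders rather than the authors' setup or benchmark data.

```python
import math
import zlib

# Illustrative architecture text; only the string is used, so torch is not needed.
ARCH_A = '''class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3, padding=1)
        self.pool = nn.AvgPool2d(3, stride=1, padding=1)

    def forward(self, x):
        return self.pool(self.conv(x))
'''
ARCH_B = ARCH_A.replace("AvgPool2d", "MaxPool2d")
ARCH_C = ARCH_A.replace("16", "32")


def embed_code(code, dim=64, n=3):
    """STAND-IN for a frozen LM: L2-normalised hashed character n-gram counts."""
    vec = [0.0] * dim
    for i in range(len(code) - n + 1):
        vec[zlib.crc32(code[i:i + n].encode()) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def predict(head, x):
    """Apply a linear head (weights, bias) to one embedding."""
    w, b = head
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def mse(head, X, y):
    """Mean squared error of the head over a set of embeddings."""
    return sum((predict(head, x) - t) ** 2 for x, t in zip(X, y)) / len(X)


def fit_linear_head(X, y, lr=0.1, epochs=500):
    """Fit a linear regression head by full-batch gradient descent on MSE.

    Only the head is trained; the embeddings stay fixed, mirroring the
    frozen-extractor setting.
    """
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, t in zip(X, y):
            err = predict((w, b), x) - t
            grad_b += err
            for j in range(dim):
                grad_w[j] += err * x[j]
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Placeholder targets for illustration only, not benchmark accuracies.
X = [embed_code(c) for c in (ARCH_A, ARCH_B, ARCH_C)]
y = [0.70, 0.65, 0.60]
head = fit_linear_head(X, y)
```

Swapping in a real LM would mean replacing `embed_code` with a function that tokenizes the class text and pools the frozen model's hidden states; the head and the training loop would not need to change.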
