LG | Jul 28, 2023

Shrink-Perturb Improves Architecture Mixing during Population Based Training for Neural Architecture Search

arXiv:2307.15621v1 | 1 citation | h-index: 38 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the challenge of efficient architecture search in machine learning, offering a parallelizable method that directly yields usable networks without retraining, though it appears incremental as an adaptation of existing PBT techniques.

The authors address Neural Architecture Search (NAS) by proposing PBT-NAS, which trains and mixes a population of neural networks simultaneously and passes weights to the mixed networks via shrink-perturb. On image generation and reinforcement learning tasks, it outperforms random search and mutation-based PBT baselines.

In this work, we show that simultaneously training and mixing neural networks is a promising way to conduct Neural Architecture Search (NAS). For hyperparameter optimization, reusing the partially trained weights allows for efficient search, as was previously demonstrated by the Population Based Training (PBT) algorithm. We propose PBT-NAS, an adaptation of PBT to NAS where architectures are improved during training by replacing poorly-performing networks in a population with the result of mixing well-performing ones and inheriting the weights using the shrink-perturb technique. After PBT-NAS terminates, the created networks can be directly used without retraining. PBT-NAS is highly parallelizable and effective: on challenging tasks (image generation and reinforcement learning) PBT-NAS achieves superior performance compared to baselines (random search and mutation-based PBT).
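The weight-inheritance step can be illustrated with a minimal sketch of shrink-perturb as the technique is commonly described: the inherited weights are scaled down and combined with a scaled, freshly initialized copy. The abstract does not give coefficients or an initialization scheme, so the function name `shrink_perturb`, the helper `random_init`, the 0.5 defaults, and the Gaussian initialization are all assumptions made for illustration, not the authors' implementation.

```python
import random


def shrink_perturb(inherited, fresh, shrink=0.5, perturb=0.5):
    """Blend inherited weights with freshly initialized ones.

    Both arguments map a layer name to a flat list of floats.
    The coefficient defaults are placeholders, not values from the paper.
    """
    return {
        name: [shrink * w + perturb * f for w, f in zip(weights, fresh[name])]
        for name, weights in inherited.items()
    }


def random_init(sizes, rng, scale=0.1):
    """Hypothetical initializer: small Gaussian weights for each layer."""
    return {
        name: [rng.gauss(0.0, scale) for _ in range(n)]
        for name, n in sizes.items()
    }


rng = random.Random(0)

# Weights a mixed network would inherit from well-performing parents.
inherited = {"conv1": [1.0, -2.0], "fc": [0.5, 0.5, 4.0]}

# A fresh initialization with the same layer sizes.
fresh = random_init({name: len(w) for name, w in inherited.items()}, rng)

child = shrink_perturb(inherited, fresh)
```

Setting `shrink=1.0` and `perturb=0.0` reduces this to plain weight copying, which makes it a convenient point of comparison when judging how much the blending itself contributes.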

Code Implementations: 1 repo