LG, NE, ML
May 21, 2019

Adaptive Stochastic Natural Gradient Method for One-Shot Neural Architecture Search

arXiv:1905.08537v1
95 citations
Originality: Incremental advance
AI Analysis

This work aims to make NAS more robust and widely applicable for practitioners by reducing the need for manual tuning. It is an incremental advance, since it builds on existing gradient-based NAS methods.

The authors tackle the sensitivity of neural architecture search (NAS) methods to inputs such as step-size and search space. They develop a generic optimization framework that combines stochastic relaxation with an adaptive stochastic natural gradient method, achieving near state-of-the-art performance on image classification and inpainting tasks with low computational budgets.

The high sensitivity of neural architecture search (NAS) methods to their inputs, such as the step-size (i.e., learning rate) and the search space, prevents practitioners from applying them out-of-the-box to their own problems, even though the purpose of NAS is to automate part of the tuning process. Aiming at fast, robust, and widely applicable NAS, we develop a generic optimization framework for NAS. We turn the coupled optimization of connection weights and neural architecture into a differentiable optimization by means of stochastic relaxation. The framework accepts an arbitrary search space (widely applicable) and enables gradient-based simultaneous optimization of weights and architecture (fast). We propose a stochastic natural gradient method with an adaptive step-size mechanism built upon our theoretical investigation (robust). Despite its simplicity and the absence of problem-dependent parameter tuning, our method exhibited near state-of-the-art performance with low computational budgets on both image classification and inpainting tasks.
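
The idea of stochastic relaxation combined with a stochastic natural gradient step can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single categorical architecture choice with probabilities theta, and replaces network training with a hypothetical noisy loss table (`TOY_LOSSES`). It uses a fixed step size in place of the adaptive step-size mechanism the abstract mentions. All function and variable names are illustrative.

```python
import numpy as np

# Hypothetical mean loss of each of 4 candidate operations (index 2 is best).
# In a real search this would be the loss of a network trained with that choice.
TOY_LOSSES = np.array([1.0, 0.5, 0.1, 0.8])


def natural_gradient_step(theta, samples, losses, step_size):
    """One stochastic natural gradient step on categorical probabilities.

    For a categorical distribution parameterised by its probabilities
    (expectation parameters), a Monte Carlo natural gradient estimate of the
    expected utility is the average of utility * (one_hot(sample) - theta).
    """
    k = theta.shape[0]
    one_hot = np.eye(k)[samples]                 # shape (n_samples, k)
    utilities = -(losses - losses.mean())        # lower loss -> higher utility
    grad = (utilities[:, None] * (one_hot - theta)).mean(axis=0)
    theta = theta + step_size * grad             # fixed step size (assumption)
    theta = np.clip(theta, 1e-3, None)           # keep every choice reachable
    return theta / theta.sum()                   # project back to the simplex


def search(n_steps=500, n_samples=4, step_size=0.1, seed=0):
    """Minimise the expected loss E_{c ~ p_theta}[f(c)] instead of f(c) itself."""
    rng = np.random.default_rng(seed)
    k = TOY_LOSSES.shape[0]
    theta = np.full(k, 1.0 / k)                  # start from the uniform distribution
    for _ in range(n_steps):
        samples = rng.choice(k, size=n_samples, p=theta)
        losses = TOY_LOSSES[samples] + rng.normal(0.0, 0.05, size=n_samples)
        theta = natural_gradient_step(theta, samples, losses, step_size)
    return theta


theta = search()
print(theta.round(3))
```

In this toy setting the probability mass should concentrate on the lowest-loss choice. The relaxed objective is differentiable in theta even though the underlying choice is discrete, which is what allows a gradient-based update of the architecture distribution.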

Code Implementations: 1 repo