Making Differentiable Architecture Search less local
This work targets performance collapse in differentiable neural architecture search (DARTS). It offers an incremental improvement to the DARTS search algorithm and is aimed at NAS researchers and practitioners.
The paper tackles performance collapse in differentiable neural architecture search (DARTS). It hypothesises that the collapse can arise from poor local optima around typical initial architectures and weights, and it develops a more global optimisation scheme that yields architectures with better test performance and fewer parameters.
Neural architecture search (NAS) is a recent methodology for automating the design of neural network architectures. Differentiable neural architecture search (DARTS) is a promising NAS approach that dramatically increases search efficiency. However, it has been shown to suffer from performance collapse, where the search often leads to detrimental architectures. Many recent works try to address this issue by identifying indicators for early stopping, regularising the search objective to reduce the dominance of some operations, or changing the parameterisation of the search problem. In this work, we hypothesise that performance collapse can arise from poor local optima around typical initial architectures and weights. We address this issue by developing a more global optimisation scheme that explores the search space more thoroughly without changing the DARTS problem formulation. Our experiments show that our changes to the search algorithm allow the discovery of architectures with both better test performance and fewer parameters.
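For readers unfamiliar with the formulation that the abstract says is left unchanged, the following is a minimal sketch of the continuous relaxation commonly used in DARTS: each edge outputs a softmax-weighted mixture of candidate operations, and the final architecture keeps the operation with the largest architecture weight. The toy scalar operations and the names `CANDIDATE_OPS`, `mixed_op`, and `discretise` are illustrative assumptions, not the authors' code, and the global optimisation scheme proposed in the paper is not shown.

```python
import math

# Hypothetical toy candidate operations acting on a scalar input.
# Real DARTS search spaces use convolutions, pooling, skip connections, etc.
CANDIDATE_OPS = {
    "skip": lambda x: x,
    "double": lambda x: 2.0 * x,
    "negate": lambda x: -x,
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_op(x, alphas):
    """Continuous relaxation: softmax-weighted sum of all candidate ops."""
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, CANDIDATE_OPS.values()))


def discretise(alphas):
    """Pick the name of the operation with the largest architecture weight."""
    names = list(CANDIDATE_OPS)
    best = max(range(len(alphas)), key=lambda i: alphas[i])
    return names[best]


if __name__ == "__main__":
    alphas = [0.0, 0.0, 0.0]  # uniform initial architecture weights
    print(mixed_op(1.0, alphas))
    print(discretise([0.1, 0.5, -1.0]))
```

Because the architecture weights are continuous, they can be trained by gradient descent alongside the network weights. This is also why the search can settle into a poor local optimum near its initialisation, which is the failure mode the abstract hypothesises.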