CV | AI | Sep 19, 2020

MSR-DARTS: Minimum Stable Rank of Differentiable Architecture Search

arXiv:2009.09209v2 | 1 citation | Has Code
AI Analysis

The paper addresses a specific bottleneck in neural architecture search, overfitting in DARTS, and offers an incremental improvement aimed at better model generalization.

The paper tackles the overfitting problem in differentiable architecture search (DARTS) by proposing MSR-DARTS, which uses a minimum stable rank criterion to select architectures for better generalization, achieving an error rate of 2.54% on CIFAR-10 and a top-1 error rate of 23.9% on ImageNet.

In neural architecture search (NAS), differentiable architecture search (DARTS) has recently attracted much attention due to its high efficiency. It defines an over-parameterized network with mixed edges, each of which represents all operator candidates, and jointly optimizes the weights of the network and its architecture in an alternating manner. However, this method tends to select a model whose weights converge faster than those of the other candidates, and such a fast-converging model often overfits. Accordingly, the resulting model does not always generalize well. To overcome this problem, we propose minimum stable rank DARTS (MSR-DARTS), which aims to find the model with the best generalization error by replacing architecture optimization with a selection process based on the minimum stable rank criterion. Specifically, each convolution operator is represented by a matrix, and MSR-DARTS selects the operator with the smallest stable rank. We evaluated MSR-DARTS on the CIFAR-10 and ImageNet datasets. It achieves an error rate of 2.54% with 4.0M parameters within 0.3 GPU-days on CIFAR-10, and a top-1 error rate of 23.9% on ImageNet. The official code is available at https://github.com/mtaecchhi/msrdarts.git.
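
To make the selection criterion concrete, here is a minimal sketch of choosing an operator by minimum stable rank. It uses the standard definition of the stable rank of a matrix A, the squared Frobenius norm divided by the squared spectral norm. It is not the authors' implementation. The helper names (`flatten_kernel`, `spectral_norm_sq`, `stable_rank`, `select_min_stable_rank`) are hypothetical, and reshaping a kernel to one row per output channel is an assumption, since the abstract does not say how a convolution is turned into a matrix. The spectral norm is estimated by power iteration so that the example needs only the standard library.

```python
import math

def flatten_kernel(kernel):
    """Reshape a kernel nested as [out][in][kh][kw] into a 2-D matrix with
    one row per output channel (an illustrative assumption, not the paper's)."""
    return [[w for in_ch in out_ch for row in in_ch for w in row]
            for out_ch in kernel]

def _matvec(a, v):
    """Multiply matrix a (list of rows) by vector v."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]

def spectral_norm_sq(a, iters=200):
    """Estimate the squared spectral norm of a by power iteration on A^T A.
    Limitation: the fixed start vector must not be orthogonal to the top
    right singular vector, otherwise the estimate is too small."""
    at = [list(col) for col in zip(*a)]
    v = [1.0 + 0.1 * i for i in range(len(a[0]))]  # deterministic start
    for _ in range(iters):
        w = _matvec(at, _matvec(a, v))
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    av = _matvec(a, v)
    return sum(x * x for x in av)  # ||A v||^2 for a unit vector v

def stable_rank(a):
    """Stable rank: squared Frobenius norm over squared spectral norm."""
    frob_sq = sum(x * x for row in a for x in row)
    spec_sq = spectral_norm_sq(a)
    if spec_sq == 0.0:
        raise ValueError("stable rank is undefined for a zero matrix")
    return frob_sq / spec_sq

def select_min_stable_rank(candidates):
    """Return the name of the candidate matrix with the smallest stable rank."""
    return min(candidates, key=lambda name: stable_rank(candidates[name]))

# Toy candidates on one edge: an identity map and a rank-one kernel.
candidates = {
    "identity": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    "rank_one": flatten_kernel([[[[1.0, 2.0]]], [[[2.0, 4.0]]]]),
}
best = select_min_stable_rank(candidates)
print(best)
```

With NumPy available, the two norms could instead be taken from `numpy.linalg.norm(A, 'fro')` and `numpy.linalg.norm(A, 2)`. Power iteration is used here only to keep the sketch dependency-free.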
