CVSep 23, 2019

Scheduled Differentiable Architecture Search for Visual Recognition

arXiv:1909.10236v13 citations
Originality Incremental advance
AI Analysis

This work addresses the problem of reducing human effort in architecture design for computer vision researchers, though it is incremental as it builds on differentiable architecture search.

The paper tackles the challenge of automatically designing convolutional neural network architectures for visual recognition by introducing Scheduled Differentiable Architecture Search (SDAS), which progressively fixes operations during training, resulting in a 2-fold faster search than existing methods and superior performance on datasets like CIFAR10 and ImageNet.

Convolutional Neural Networks (CNN) have been regarded as a capable class of models for visual recognition problems. Nevertheless, it is not trivial to develop generic and powerful network architectures, which requires significant efforts of human experts. In this paper, we introduce a new idea for automatically exploring architectures on a remould of Differentiable Architecture Search (DAS), which possesses the efficient search via gradient descent. Specifically, we present Scheduled Differentiable Architecture Search (SDAS) for both image and video recognition that nicely integrates the selection of operations during training with a schedule. Technically, an architecture or a cell is represented as a directed graph. Our SDAS gradually fixes the operations on the edges in the graph in a progressive and scheduled manner, as opposed to a one-step decision of operations for all the edges once the training completes in existing DAS, which may make the architecture brittle. Moreover, we enlarge the search space of SDAS particularly for video recognition by devising several unique operations to encode spatio-temporal dynamics and demonstrate the impact in affecting the architecture search of SDAS. Extensive experiments of architecture learning are conducted on CIFAR10, Kinetics10, UCF101 and HMDB51 datasets, and superior results are reported when comparing to DAS method. More remarkably, the search by our SDAS is around 2-fold faster than DAS. When transferring the learnt cells on CIFAR10 and Kinetics10 respectively to large-scale ImageNet and Kinetics400 datasets, the constructed network also outperforms several state-of-the-art hand-crafted structures.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes