Variational Depth Search in ResNets
This work addresses the computational cost of neural architecture search for researchers and practitioners, though its contribution is incremental because it restricts the search to the depth of residual networks.
The paper tackles the problem of efficiently searching for the optimal depth of residual networks by proposing a variational objective for one-shot architecture search. The resulting pruned networks maintain competitive accuracy on the MNIST, Fashion-MNIST, and SVHN datasets and provide better-calibrated uncertainty estimates.
One-shot neural architecture search allows joint learning of weights and network architecture, reducing computational cost. We limit our search space to the depth of residual networks and formulate an analytically tractable variational objective that allows for obtaining an unbiased approximate posterior over depths in one shot. We propose a heuristic to prune our networks based on this distribution. We compare our proposed method against manual search over network depths on the MNIST, Fashion-MNIST, and SVHN datasets. We find that pruned networks do not incur a loss in predictive performance, obtaining accuracies competitive with unpruned networks. Marginalising over depth allows us to obtain better-calibrated test-time uncertainty estimates than regular networks, in a single forward pass.
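
The ideas in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. It assumes a categorical approximate posterior q(d) over depths, where index d means the first d residual blocks are kept, and it assumes that one output head is applied to the activations after every block, so that all per-depth predictions come from one pass through the full network. Under those assumptions the expectation in the variational objective is a finite sum and needs no sampling. The function names, the toy numbers, and in particular the 0.95 threshold in `prune_depth` are hypothetical; the abstract only states that a pruning heuristic based on the depth distribution is used.

```python
import math


def softmax(logits):
    """Turn unconstrained depth logits into a categorical distribution q(d)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def elbo(log_liks, q, prior):
    """Exact ELBO for a categorical q over depths.

    ELBO = sum_d q(d) * log p(y | x, d) - KL(q || prior)
    log_liks[d] is the data log-likelihood when only the first d blocks are used.
    """
    expected_ll = sum(qd * ll for qd, ll in zip(q, log_liks))
    kl = sum(qd * math.log(qd / pd) for qd, pd in zip(q, prior) if qd > 0)
    return expected_ll - kl


def marginal_predictive(per_depth_probs, q):
    """Marginalise over depth: p(y | x) = sum_d q(d) * p(y | x, d)."""
    n_classes = len(per_depth_probs[0])
    return [
        sum(qd * probs[c] for qd, probs in zip(q, per_depth_probs))
        for c in range(n_classes)
    ]


def prune_depth(q, ratio=0.95):
    """Hypothetical pruning rule: the shallowest depth whose probability
    is at least `ratio` times the largest probability in q."""
    threshold = ratio * max(q)
    return next(d for d, qd in enumerate(q) if qd >= threshold)


# Toy values (assumptions for illustration, not results from the paper).
depth_logits = [0.0, 2.0, 2.0, 0.5]      # learnable parameters of q(d)
q = softmax(depth_logits)
prior = [0.25] * 4                        # uniform prior over four depths
log_liks = [-2.3, -0.4, -0.35, -0.5]      # log p(y | x, d) for each depth
per_depth_probs = [[0.5, 0.5], [0.9, 0.1], [0.95, 0.05], [0.8, 0.2]]

print("ELBO:", elbo(log_liks, q, prior))
print("Marginal predictive:", marginal_predictive(per_depth_probs, q))
print("Depth kept after pruning:", prune_depth(q))
```

In this sketch, blocks beyond the depth returned by `prune_depth` would be discarded after training. The test-time prediction is a mixture of the per-depth predictions, which is one plausible reading of how marginalising over depth could yield the calibration benefit the abstract reports.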