ML, LG · Sep 1, 2016

Neural Network Architecture Optimization through Submodularity and Supermodularity

arXiv:1609.00074v3 · 12 citations
Originality: Incremental advance
AI Analysis

This work targets deep learning practitioners who must choose a network's depth and width under an accuracy or computation-time constraint. It is an incremental advance whose contribution is a pair of efficient greedy selection algorithms.

The paper optimizes neural network architectures in two settings: maximizing accuracy under a computation time budget, and minimizing computation time under an accuracy requirement. It converts both into a subset selection problem and solves them with greedy algorithms that rely on the submodularity of accuracy and the supermodularity of computation time. In the reported experiments, the resulting models are more accurate or faster.
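The two settings can be written compactly as constrained set optimization problems. The notation below is assumed for illustration and is not taken from the paper: V is the ground set of selectable architecture components, S is the chosen subset, A(S) is accuracy, T(S) is computation time, B is the time budget, and α is the accuracy requirement.

```latex
% Problem 1: maximize accuracy under a computation time budget B
\max_{S \subseteq V} \; A(S) \quad \text{subject to} \quad T(S) \le B

% Problem 2: minimize computation time under an accuracy requirement \alpha
\min_{S \subseteq V} \; T(S) \quad \text{subject to} \quad A(S) \ge \alpha

% Submodularity of A (diminishing returns): for all S \subseteq S' \subseteq V, e \in V \setminus S'
A(S \cup \{e\}) - A(S) \;\ge\; A(S' \cup \{e\}) - A(S')

% Supermodularity of T (increasing marginal cost): same S, S', e
T(S \cup \{e\}) - T(S) \;\le\; T(S' \cup \{e\}) - T(S')
```

Read together, the two inequalities say that each added component helps accuracy less, and costs more time, the larger the current architecture already is.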

The architecture of a deep learning model, including its depth and width, is a key factor in its performance, such as test accuracy and computation time. This paper solves two problems: given a computation time budget, choose an architecture that maximizes accuracy; and given an accuracy requirement, choose an architecture that minimizes computation time. We convert this architecture optimization into a subset selection problem. Using the submodularity of accuracy and the supermodularity of computation time, we propose efficient greedy optimization algorithms. The experiments demonstrate our algorithms' ability to find more accurate or faster models. By analyzing how the architecture evolves as the time budget grows, we discuss the relationships among accuracy, time, and architecture, and give suggestions on neural network architecture design.
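To make the subset-selection view concrete, here is a minimal sketch of a generic greedy heuristic for the first problem (maximize accuracy under a time budget). It is not the paper's algorithm: the function `greedy_under_budget`, the component names, and the toy `accuracy` and `time_cost` functions are all hypothetical. The toy accuracy is a concave function of a non-negative sum (hence submodular) and the toy time is a convex function of a non-negative sum (hence supermodular), chosen only so the example runs without training any network.

```python
import math
from typing import Callable, Iterable, Set

# Hypothetical per-component contributions; NOT measurements from the paper.
WEIGHTS = {"conv1_extra": 4.0, "conv2_extra": 1.0, "fc_extra": 4.0}
COSTS = {"conv1_extra": 1.0, "conv2_extra": 1.0, "fc_extra": 3.0}


def accuracy(subset: Set[str]) -> float:
    """Toy submodular surrogate: square root of summed weights."""
    return math.sqrt(sum(WEIGHTS[e] for e in subset))


def time_cost(subset: Set[str]) -> float:
    """Toy supermodular surrogate: square of summed costs."""
    return sum(COSTS[e] for e in subset) ** 2


def greedy_under_budget(
    candidates: Iterable[str],
    acc: Callable[[Set[str]], float],
    cost: Callable[[Set[str]], float],
    budget: float,
) -> Set[str]:
    """Repeatedly add the feasible component with the best gain per unit of added time."""
    selected: Set[str] = set()
    remaining = set(candidates)
    while remaining:
        best, best_ratio = None, 0.0
        for e in sorted(remaining):  # sorted for deterministic tie-breaking
            trial = selected | {e}
            if cost(trial) > budget:
                continue  # adding e would exceed the time budget
            gain = acc(trial) - acc(selected)
            extra = cost(trial) - cost(selected)
            ratio = gain / extra if extra > 0 else math.inf
            if gain > 0 and ratio > best_ratio:
                best, best_ratio = e, ratio
        if best is None:
            break  # no feasible component improves accuracy
        selected.add(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    chosen = greedy_under_budget(WEIGHTS, accuracy, time_cost, budget=4.0)
    print(sorted(chosen), accuracy(chosen), time_cost(chosen))
```

In practice the two callables would be replaced by measured or estimated accuracy and runtime of the candidate architecture, which is far more expensive to evaluate than these closed-form surrogates. The second problem (minimize time under an accuracy requirement) could be sketched the same way by stopping once the accuracy target is reached.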

