LG DSOct 16, 2020

An Approximation Algorithm for Optimal Subarchitecture Extraction

arXiv:2010.08512v12.31 citations

Originality Incremental advance

AI Analysis

This addresses the challenge of efficiently selecting neural network architectures for practitioners, though it appears incremental as it builds on existing approximation methods.

The paper tackles the problem of finding optimal architectural parameters for deep neural networks under metrics like parameter size, inference speed, and error rate, presenting an approximation algorithm that behaves like an FPTAS with an approximation error of ρ ≤ |1 - ε| for many instances.

We consider the problem of finding the set of architectural parameters for a chosen deep neural network which is optimal under three metrics: parameter size, inference speed, and error rate. In this paper we state the problem formally, and present an approximation algorithm that, for a large subset of instances behaves like an FPTAS with an approximation error of $ρ\leq |{1- ε}|$, and that runs in $O(|Ξ| + |{W^*_T}|(1 + |Θ||{B}||Ξ|/({ε\, s^{3/2})}))$ steps, where $ε$ and $s$ are input parameters; $|{B}|$ is the batch size; $|{W^*_T}|$ denotes the cardinality of the largest weight set assignment; and $|Ξ|$ and $|Θ|$ are the cardinalities of the candidate architecture and hyperparameter spaces, respectively.

View on arXiv PDF

Similar