LGDSOct 16, 2020

An Approximation Algorithm for Optimal Subarchitecture Extraction

arXiv:2010.08512v11 citations
Originality Incremental advance
AI Analysis

This addresses the challenge of efficiently selecting neural network architectures for practitioners, though it appears incremental as it builds on existing approximation methods.

The paper tackles the problem of finding optimal architectural parameters for deep neural networks under metrics like parameter size, inference speed, and error rate, presenting an approximation algorithm that behaves like an FPTAS with an approximation error of ρ ≤ |1 - ε| for many instances.

We consider the problem of finding the set of architectural parameters for a chosen deep neural network which is optimal under three metrics: parameter size, inference speed, and error rate. In this paper we state the problem formally, and present an approximation algorithm that, for a large subset of instances behaves like an FPTAS with an approximation error of $ρ\leq |{1- ε}|$, and that runs in $O(|Ξ| + |{W^*_T}|(1 + |Θ||{B}||Ξ|/({ε\, s^{3/2})}))$ steps, where $ε$ and $s$ are input parameters; $|{B}|$ is the batch size; $|{W^*_T}|$ denotes the cardinality of the largest weight set assignment; and $|Ξ|$ and $|Θ|$ are the cardinalities of the candidate architecture and hyperparameter spaces, respectively.

Code Implementations2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes