LG · Feb 17, 2021

Firefly Neural Architecture Descent: a General Approach for Growing Neural Networks

arXiv:2102.08574v2 · 72 citations
AI Analysis

This work addresses the challenge of learning resource-efficient neural architectures that avoid catastrophic forgetting in continual learning, a known bottleneck for which it proposes a new method.

The authors jointly optimize network parameters and architectures with firefly neural architecture descent, a framework that grows networks dynamically. Each step searches a functional neighborhood of the current network in a steepest descent fashion and uses a Taylor approximation to select the best candidate structure greedily. On a continual image classification task, it learns smaller networks with higher average accuracy than state-of-the-art methods.

We propose firefly neural architecture descent, a general framework for progressively and dynamically growing neural networks to jointly optimize the networks' parameters and architectures. Our method works in a steepest descent fashion, which iteratively finds the best network within a functional neighborhood of the original network that includes a diverse set of candidate network structures. By using Taylor approximation, the optimal network structure in the neighborhood can be found with a greedy selection procedure. We show that firefly descent can flexibly grow networks both wider and deeper, and can be applied to learn accurate but resource-efficient neural architectures that avoid catastrophic forgetting in continual learning. Empirically, firefly descent achieves promising results on both neural architecture search and continual learning. In particular, on a challenging continual image classification task, it learns networks that are smaller in size but have higher average accuracy than those learned by the state-of-the-art methods.
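The greedy, Taylor-based selection described in the abstract can be illustrated with a small sketch. This is a simplified illustration under stated assumptions, not the authors' algorithm: the one-hidden-layer tanh network, the random candidate pool, and every name (`grow_step`, `n_candidates`, `k`, `step`) are hypothetical. Each candidate neuron starts with output weight zero, so adding it leaves the network's function unchanged; the gradient of the loss with respect to that output weight then gives a first-order estimate of how much the candidate could reduce the loss, and the `k` highest-scoring candidates are kept.

```python
import numpy as np

def forward(X, W, a):
    """One-hidden-layer network: f(x) = sum_j a_j * tanh(w_j . x)."""
    return np.tanh(X @ W.T) @ a

def mse(pred, y):
    """Half mean squared error."""
    return 0.5 * np.mean((pred - y) ** 2)

def grow_step(X, y, W, a, n_candidates=20, k=2, step=0.1, rng=None):
    """Widen the network by the k candidates with the largest first-order score.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Assumed candidate pool: random input weights, output weight eps = 0,
    # so any candidate can be added without changing the current function.
    W_new = rng.normal(size=(n_candidates, X.shape[1]))
    H_new = np.tanh(X @ W_new.T)          # candidate activations, shape (n, m)
    residual = forward(X, W, a) - y       # prediction error, shape (n,)
    # Gradient of the loss w.r.t. each candidate's output weight at eps = 0.
    g = H_new.T @ residual / len(y)       # shape (m,)
    # First-order Taylor: L(eps) ~ L(0) + g . eps, so rank candidates by |g_i|.
    keep = np.argsort(-np.abs(g))[:k]
    eps = -step * g[keep]                 # small step against the gradient
    return np.vstack([W, W_new[keep]]), np.concatenate([a, eps])
```

Calling `grow_step` repeatedly, with ordinary parameter training in between, would yield a network that widens only where the first-order estimate predicts a loss reduction. The sketch only widens a single layer; growing deeper, as the abstract mentions, is outside its scope.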

Code Implementations: 1 repo