LGJul 26, 2024

On Supernet Transfer Learning for Effective Task Adaptation

arXiv:2407.20279v3h-index: 6
Originality Highly original
AI Analysis

This addresses the problem of efficient and adaptable model design for AI practitioners, offering a practical improvement over existing methods.

The paper tackles the computational expense and architectural inflexibility of Neural Architecture Search (NAS) and transfer learning by introducing supernet transfer learning, which finetunes both weights and architectures to new tasks, resulting in models that are 3 to 5 times faster to discover and better than NAS from scratch.

Neural Architecture Search (NAS) methods have been shown to outperform hand-designed models and help to democratize AI. However, NAS methods often start from scratch with each new task, making them computationally expensive and limiting their applicability. Transfer learning is a practical alternative with the rise of ever-larger pretrained models. However, it is also bound to the architecture of the pretrained model, which inhibits proper adaptation of the architecture to different tasks, leading to suboptimal (and excessively large) models. We address both challenges at once by introducing a novel and practical method to \textit{transfer supernets}, which parameterize both weight and architecture priors, and efficiently finetune both to new tasks. This enables supernet transfer learning as a replacement for traditional transfer learning that also finetunes model architectures to new tasks. Through extensive experiments across multiple image classification tasks, we demonstrate that supernet transfer learning does not only drastically speed up the discovery of optimal models (3 to 5 times faster on average), but will also find better models than running NAS from scratch. The added model flexibility also increases the robustness of transfer learning, yielding positive transfer to even very different target datasets, especially with multi-dataset pretraining.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes