LG | CV | Oct 24, 2023

NetDistiller: Empowering Tiny Deep Learning via In-Situ Distillation

arXiv:2310.19820v1 | h-index: 20 | Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the problem of deploying efficient deep learning on resource-constrained edge devices and represents an incremental improvement in distillation techniques.

The paper tackles the challenge of improving the task accuracy of tiny neural networks (TNNs) on edge devices. It proposes NetDistiller, a framework that uses in-situ distillation with a weight-sharing teacher and achieves higher accuracy than state-of-the-art methods across diverse tasks.

Boosting the task accuracy of tiny neural networks (TNNs) has become a fundamental challenge for enabling the deployment of TNNs on edge devices, which are constrained by strict limitations in terms of memory, computation, bandwidth, and power supply. To this end, we propose a framework called NetDistiller to boost the achievable accuracy of TNNs by treating them as sub-networks of a weight-sharing teacher constructed by expanding the number of channels of the TNN. Specifically, the target TNN model is jointly trained with the weight-sharing teacher model via (1) gradient surgery to tackle the gradient conflicts between them and (2) uncertainty-aware distillation to mitigate the overfitting of the teacher model. Extensive experiments across diverse tasks validate NetDistiller's effectiveness in boosting TNNs' achievable accuracy over state-of-the-art methods. Our code is available at https://github.com/GATECH-EIC/NetDistiller.
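To make the two structural ideas in the abstract concrete, here is a minimal sketch in plain Python. It is not taken from the NetDistiller repository. The names `student_view` and `project_conflict` are hypothetical, and so are two choices the abstract does not specify: the student is taken to be the leading channels of the teacher, and the conflict is resolved by a PCGrad-style projection of the teacher gradient.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def student_view(teacher_weight: Matrix, out_ch: int, in_ch: int) -> Matrix:
    """Take the leading out_ch x in_ch block of a teacher weight matrix.

    Assumption: the student occupies the first channels of the widened
    teacher. This sketch copies the values; a real framework would slice
    the shared tensor so both models update the same storage.
    """
    return [row[:in_ch] for row in teacher_weight[:out_ch]]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def project_conflict(g_student: Vector, g_teacher: Vector) -> Vector:
    """PCGrad-style projection (an assumed form of gradient surgery).

    If the two gradients on the shared weights point in conflicting
    directions (negative dot product), remove from the teacher gradient
    its component along the student gradient. Otherwise return it as is.
    """
    d = dot(g_teacher, g_student)
    norm_sq = dot(g_student, g_student)
    if d >= 0 or norm_sq == 0:
        return list(g_teacher)
    scale = d / norm_sq
    return [t - scale * s for t, s in zip(g_teacher, g_student)]


# Teacher layer with 3 output and 4 input channels; student uses 2 x 2.
teacher = [[1.0, 2.0, 3.0, 4.0],
           [5.0, 6.0, 7.0, 8.0],
           [9.0, 10.0, 11.0, 12.0]]
student = student_view(teacher, out_ch=2, in_ch=2)

# Conflicting gradients on the shared weights.
g_s = [1.0, 0.0]
g_t = [-1.0, 1.0]
g_t_fixed = project_conflict(g_s, g_t)
```

After projection, the teacher gradient no longer opposes the student gradient on the shared weights, so a joint update step cannot undo the student's progress along its own descent direction. Which of the two gradients is projected, and whether this matches the paper's procedure, should be checked against the linked code.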

