LG Nov 22, 2023

REDS: Resource-Efficient Deep Subnetworks for Dynamic Resource Constraints

arXiv:2311.13349v3
3 citations
h-index: 33
Originality: Highly original
AI Analysis

This paper addresses the challenge of dynamic resource variability in edge device deployments, offering a novel method for runtime adaptation.

The paper tackles the problem of adapting deep learning models to variable resource constraints on edge devices by introducing REDS. REDS achieves computational efficiency through structured sparsity and hardware-specific optimizations. The authors report adaptation times under 40 μs for a fully-connected network on Arduino Nano 33 BLE, along with strong submodel test accuracy across multiple benchmarks.

Deep learning models deployed on edge devices frequently encounter resource variability, which arises from fluctuating energy levels, timing constraints, or prioritization of other critical tasks within the system. State-of-the-art machine learning pipelines generate resource-agnostic models that cannot adapt at runtime. In this work, we introduce Resource-Efficient Deep Subnetworks (REDS) to tackle model adaptation to variable resources. In contrast to the state of the art, REDS leverages structured sparsity constructively by exploiting permutation invariance of neurons, which allows for hardware-specific optimizations. Specifically, REDS achieves computational efficiency by (1) skipping sequential computational blocks identified by a novel iterative knapsack optimizer, and (2) taking advantage of the data cache by re-arranging the order of operations in the REDS computational graph. REDS supports conventional deep networks frequently deployed on the edge and provides computational benefits even for small and simple networks. We evaluate REDS on eight benchmark architectures trained on the Visual Wake Words, Google Speech Commands, Fashion-MNIST, CIFAR-10 and ImageNet-1K datasets, and test on four off-the-shelf mobile and embedded hardware platforms. We provide a theoretical result and empirical evidence demonstrating REDS' outstanding performance in terms of submodels' test set accuracy, and demonstrate an adaptation time in response to dynamic resource constraints of under 40 μs, using a fully-connected network on Arduino Nano 33 BLE.
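
The permutation invariance mentioned in the abstract can be illustrated with a small sketch. This is not the REDS implementation: the two-layer network, the importance score (L1 norm of each neuron's outgoing weights), and all function names are assumptions chosen for illustration. It shows that reordering hidden neurons leaves the output unchanged, so that a smaller subnetwork can be taken as a contiguous prefix of the reordered neurons.

```python
import math
import random

def forward(x, w_in, b_hid, w_out, b_out):
    """Two-layer fully connected network with ReLU.
    w_in[j] holds the input weights of hidden neuron j,
    w_out[j] holds its outgoing weights (one per output)."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(w_in[j], x)) + b_hid[j])
              for j in range(len(w_in))]
    return [b_out[o] + sum(hidden[j] * w_out[j][o] for j in range(len(hidden)))
            for o in range(len(b_out))]

def permute_hidden(w_in, b_hid, w_out, order):
    """Reorder hidden neurons; each neuron keeps its own weights and bias."""
    return ([w_in[j] for j in order],
            [b_hid[j] for j in order],
            [w_out[j] for j in order])

def prefix_subnetwork(w_in, b_hid, w_out, k):
    """Keep the first k hidden neurons as one contiguous slice."""
    return w_in[:k], b_hid[:k], w_out[:k]

random.seed(0)
n_in, n_hid, n_out = 8, 16, 3
x = [random.gauss(0, 1) for _ in range(n_in)]
w_in = [[random.gauss(0, 1) for _ in range(n_in)] for _ in range(n_hid)]
b_hid = [random.gauss(0, 1) for _ in range(n_hid)]
w_out = [[random.gauss(0, 1) for _ in range(n_out)] for _ in range(n_hid)]
b_out = [random.gauss(0, 1) for _ in range(n_out)]

# Hypothetical importance score (an assumption, not the paper's criterion):
# L1 norm of each hidden neuron's outgoing weights, most important first.
importance = [sum(abs(w) for w in row) for row in w_out]
order = sorted(range(n_hid), key=lambda j: -importance[j])

w_in_p, b_hid_p, w_out_p = permute_hidden(w_in, b_hid, w_out, order)
full = forward(x, w_in, b_hid, w_out, b_out)
permuted = forward(x, w_in_p, b_hid_p, w_out_p, b_out)

# A smaller subnetwork is just the first k neurons of the permuted layout.
w_in_s, b_hid_s, w_out_s = prefix_subnetwork(w_in_p, b_hid_p, w_out_p, k=8)
sub = forward(x, w_in_s, b_hid_s, w_out_s, b_out)
```

Because the kept neurons sit next to each other in memory, switching between subnetwork sizes in this sketch only changes the slice bound `k`; no weights are copied or rearranged at runtime.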

Code Implementations: 1 repo