UDC: Unified DNAS for Compressible TinyML Models
UDC addresses the limited memory capacity available for deploying TinyML models on IoT devices. It represents a strong, specific gain rather than a broad paradigm shift.
The paper tackled the challenge of designing compressible neural networks for TinyML on low-cost IoT hardware by introducing Unified DNAS for Compressible (UDC) NNs, which achieved up to 3.35x smaller models at iso-accuracy or 6.25% higher accuracy at iso-model size on ImageNet compared to previous work.
Deploying TinyML models on low-cost IoT hardware is challenging because device memory capacity is limited. Neural processing unit (NPU) hardware addresses this memory challenge by using model compression, exploiting weight quantization and sparsity to fit more parameters in the same footprint. However, designing compressible neural networks (NNs) is difficult, as it expands the design space across which we must make balanced trade-offs. This paper demonstrates Unified DNAS for Compressible (UDC) NNs, which explores a large search space to generate state-of-the-art compressible NNs for NPUs. ImageNet results show UDC networks are up to $3.35\times$ smaller (iso-accuracy) or 6.25% more accurate (iso-model size) than previous work.
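To make the "more parameters in the same footprint" claim concrete, the following is a minimal back-of-the-envelope sketch of how quantization and sparsity interact with a fixed memory budget. It is not the compression format used by the paper or by any particular NPU: it assumes a hypothetical encoding in which a sparse tensor stores a one-bit-per-weight mask plus the quantized nonzero values, and a dense tensor stores only the quantized values. The function names and the encoding are illustrative assumptions.

```python
def compressed_size_bytes(num_params: int, bits_per_weight: int, sparsity: float) -> int:
    """Estimate storage for a weight tensor under an ASSUMED bitmask encoding.

    Assumption (not from the paper): sparse tensors cost 1 mask bit per weight
    plus `bits_per_weight` bits per nonzero weight; dense tensors need no mask.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    nonzero = round(num_params * (1.0 - sparsity))
    mask_bits = num_params if sparsity > 0.0 else 0
    value_bits = nonzero * bits_per_weight
    return (mask_bits + value_bits + 7) // 8  # round up to whole bytes


def params_that_fit(budget_bytes: int, bits_per_weight: int, sparsity: float) -> int:
    """Estimate how many parameters fit in `budget_bytes` under the same assumed encoding."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    mask_bits_per_param = 1.0 if sparsity > 0.0 else 0.0
    bits_per_param = mask_bits_per_param + (1.0 - sparsity) * bits_per_weight
    return int(budget_bytes * 8 / bits_per_param)


if __name__ == "__main__":
    budget = 1_000_000  # hypothetical 1 MB weight budget
    for bits, sparsity in [(8, 0.0), (4, 0.0), (8, 0.5), (4, 0.5)]:
        n = params_that_fit(budget, bits, sparsity)
        print(f"{bits}-bit, {sparsity:.0%} sparse: about {n:,} parameters")
```

Under these assumptions, a 1 MB budget holds 1,000,000 dense 8-bit weights but roughly 2.67 million weights at 4 bits and 50% sparsity. The sketch also shows why the design space grows: bit width, sparsity, and parameter count all trade off against one another within the same footprint, which is the balance the abstract says a search must strike.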