CV · LG · Sep 30, 2020

AttendNets: Tiny Deep Image Recognition Neural Networks for the Edge via Visual Attention Condensers

arXiv:2009.14385v1 · 23 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of efficient deep learning for edge devices, offering a novel architecture with significant improvements in accuracy and resource usage, though it appears incremental in the context of existing efficient network designs.

The authors tackled the challenge of deploying deep neural networks for on-device image recognition in TinyML applications by introducing AttendNets, which achieved ~7.2% higher accuracy with ~3× fewer operations and ~4.17× fewer parameters than MobileNet-V1 on the ImageNet50 benchmark.

While significant advances in deep learning have resulted in state-of-the-art performance across a large number of complex visual perception tasks, the widespread deployment of deep neural networks for TinyML applications involving on-device, low-power image recognition remains a major challenge given the complexity of deep neural networks. In this study, we introduce AttendNets, low-precision, highly compact deep neural networks tailored for on-device image recognition. More specifically, AttendNets possess deep self-attention architectures based on visual attention condensers, which extend the recently introduced stand-alone attention condensers to improve spatial-channel selective attention. Furthermore, AttendNets have unique machine-designed macroarchitecture and microarchitecture designs achieved via a machine-driven design exploration strategy. Experimental results on the ImageNet$_{50}$ benchmark dataset for the task of on-device image recognition showed that AttendNets have significantly lower architectural and computational complexity than several efficiency-oriented deep neural networks from the research literature while achieving the highest accuracies (with the smallest AttendNet achieving $\sim$7.2% higher accuracy, while requiring $\sim$3$\times$ fewer multiply-add operations, $\sim$4.17$\times$ fewer parameters, and $\sim$16.7$\times$ lower weight memory requirements than MobileNet-V1). Based on these promising results, AttendNets illustrate the effectiveness of visual attention condensers as building blocks for enabling various on-device visual perception tasks for TinyML applications.
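
The reported reductions can be cross-checked with a little arithmetic. Weight memory is roughly the parameter count times the bits stored per weight, so dividing the memory reduction by the parameter reduction leaves the implied reduction in bits per weight. The sketch below is illustrative only: the helper name is hypothetical, and the 32-bit floating-point baseline for MobileNet-V1 is an assumption rather than something stated in the abstract.

```python
# Illustrative cross-check of the abstract's ratios; not code from the paper.
# The helper name and the 32-bit baseline are assumptions made for this sketch.

def implied_bit_width_ratio(param_ratio: float, memory_ratio: float) -> float:
    """Return the bits-per-weight reduction implied by two reported ratios.

    weight memory ~ parameters * bits per weight, so
    memory_ratio = param_ratio * bit_width_ratio.
    """
    return memory_ratio / param_ratio


PARAM_RATIO = 4.17   # ~4.17x fewer parameters than MobileNet-V1 (reported)
MEMORY_RATIO = 16.7  # ~16.7x lower weight memory than MobileNet-V1 (reported)

bit_ratio = implied_bit_width_ratio(PARAM_RATIO, MEMORY_RATIO)
print(f"implied bits-per-weight reduction: ~{bit_ratio:.2f}x")  # ~4.00x

# Assumption: the MobileNet-V1 baseline stores weights as 32-bit floats.
BASELINE_BITS = 32
implied_bits = BASELINE_BITS / bit_ratio
print(f"implied AttendNet weight precision: ~{implied_bits:.1f} bits")  # ~8.0 bits
```

Under that assumption, the extra factor of about 4 beyond the parameter reduction is what one would expect from storing weights at roughly 8 bits, which fits the abstract's description of AttendNets as low-precision networks.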

