SD · AI · AR · AS · Aug 4, 2022

Keyword Spotting System and Evaluation of Pruning and Quantization Methods on Low-power Edge Microcontrollers

arXiv:2208.02765v1 · 12 citations · h-index: 11
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of deploying deep learning for voice interaction on resource-constrained edge devices. Its contribution is incremental: it focuses on optimizing existing methods for specific hardware.

The paper tackles the challenge of running keyword spotting (KWS) on low-power edge microcontrollers by developing a small-footprint CNN system and evaluating pruning and quantization methods on it. It finds that structured pruning is more effective than unstructured pruning on this hardware, and that quantization and SIMD instructions improve performance.

Keyword spotting (KWS) is beneficial for voice-based user interaction with low-power devices at the edge. These edge devices are usually always-on, so edge computing brings bandwidth savings and privacy protection. The devices, for example Cortex-M based microcontrollers, typically have limited memory, computational performance, power, and cost budgets. The challenge is to meet the high computation and low-latency requirements of deep learning on such devices. This paper first presents our small-footprint KWS system running on an STM32F7 microcontroller with a Cortex-M7 core at 216 MHz and 512 KB of static RAM. Our selected convolutional neural network (CNN) architecture uses a reduced number of operations for KWS to meet the constraints of edge devices. Our baseline system generates a classification result every 37 ms, including real-time audio feature extraction. The paper further evaluates the actual performance of different pruning and quantization methods on the microcontroller, including different granularities of sparsity, skipping zero weights, weight-prioritized loop order, and SIMD instructions. The results show that, on microcontrollers, accelerating unstructured pruned models poses considerable challenges, and that structured pruning is friendlier to such hardware than unstructured pruning. The results also confirm the performance improvement from quantization and SIMD instructions.
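The contrast the abstract draws between sparsity granularities can be illustrated with a small sketch. This is not the paper's implementation, which targets C kernels on a Cortex-M7. It is a pure-Python toy in which the function names (`dense_forward`, `zero_skip_forward`, `drop_zero_rows`) and the example matrices are hypothetical. It counts multiply-accumulates (MACs) and per-weight zero checks as a rough proxy for work; it does not measure cycles.

```python
from typing import List, Tuple

Matrix = List[List[float]]  # weights[output_index][input_index]


def dense_forward(weights: Matrix, x: List[float]) -> Tuple[List[float], int]:
    """Plain dense layer: every stored weight costs one MAC."""
    macs = 0
    out = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            acc += w * xi
            macs += 1
        out.append(acc)
    return out, macs


def zero_skip_forward(weights: Matrix, x: List[float]) -> Tuple[List[float], int, int]:
    """Unstructured sparsity: test each weight and skip the MAC when it is zero.

    Returns (outputs, macs, zero_checks). The check itself is extra work
    that a dense loop does not pay.
    """
    macs = 0
    checks = 0
    out = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            checks += 1
            if w != 0.0:
                acc += w * xi
                macs += 1
        out.append(acc)
    return out, macs, checks


def drop_zero_rows(weights: Matrix) -> Tuple[Matrix, List[int]]:
    """Structured (row-level) pruning: remove all-zero output rows.

    What remains is a smaller dense layer, so no per-weight check is needed.
    Returns (kept_rows, original_indices_of_kept_rows).
    """
    kept = [(i, row) for i, row in enumerate(weights) if any(w != 0.0 for w in row)]
    return [row for _, row in kept], [i for i, _ in kept]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    # Both matrices are 50% sparse (8 of 16 weights are zero).
    unstructured = [
        [0.5, 0.0, 1.0, 0.0],
        [0.0, 2.0, 0.0, -1.0],
        [1.0, 0.0, 0.0, 0.5],
        [0.0, -0.5, 2.0, 0.0],
    ]
    structured = [
        [0.5, -1.0, 1.0, 2.0],
        [0.0, 0.0, 0.0, 0.0],
        [1.0, 0.5, -0.5, 0.5],
        [0.0, 0.0, 0.0, 0.0],
    ]
    _, macs, checks = zero_skip_forward(unstructured, x)
    print("unstructured:", macs, "MACs,", checks, "zero checks")
    small, kept = drop_zero_rows(structured)
    _, macs = dense_forward(small, x)
    print("structured:", macs, "MACs, 0 zero checks, rows kept:", kept)
```

At the same 50% sparsity, both variants perform 8 MACs in this toy, but the unstructured one also performs 16 data-dependent checks, while the structured one reduces to an ordinary, smaller dense loop. That regular loop is also the kind that maps naturally onto SIMD instructions, which is one plausible reading of why the abstract reports structured pruning as the friendlier option on microcontrollers.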

Code Implementations: 1 repo