Implementing Keyword Spotting on the MCXN947 Microcontroller with Integrated NPU
The work enables efficient, low-power voice interfaces on embedded devices, but the contribution is incremental: it applies known methods to a new hardware platform.
The paper tackles real-time keyword spotting on a resource-constrained microcontroller by implementing a quantized CNN-based system, achieving 97.06% accuracy and a 59x inference speedup with the integrated NPU over CPU-only execution.
This paper presents a keyword spotting (KWS) system implemented on the NXP MCXN947 microcontroller with an integrated Neural Processing Unit (NPU), enabling real-time voice interaction on resource-constrained devices. The system combines MFCC feature extraction with a CNN classifier, optimized using Quantization-Aware Training (QAT) to reduce model size with minimal accuracy drop. Experimental results show a 59x speedup in inference time when using the NPU compared with CPU-only execution. The model reaches 97.06% accuracy at a size of 30.58 KB, demonstrating the feasibility of efficient, low-power voice interfaces on embedded platforms.
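To make the quantization step more concrete, the following is a minimal sketch of the quantize-then-dequantize ("fake quantization") operation that Quantization-Aware Training typically simulates in the forward pass. The abstract does not state the quantization scheme, so the per-tensor affine int8 mapping here is an assumption, and `fake_quantize` is a hypothetical helper name rather than anything from the paper or from an NXP toolchain. The sketch omits training itself, including gradient handling.

```python
def fake_quantize(values, num_bits=8):
    """Quantize floats to signed integers, then map them back to floats.

    Assumption: per-tensor affine quantization; the paper's actual
    scheme is not given in the abstract.
    """
    qmin = -(2 ** (num_bits - 1))   # -128 for int8
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8

    # Extend the range to include 0.0 so that zero is represented exactly.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)

    # Step size between adjacent integer levels; fall back to 1.0
    # when every input is zero to avoid dividing by zero.
    scale = (hi - lo) / (qmax - qmin) or 1.0

    # Integer level that corresponds to the real value 0.0.
    zero_point = max(qmin, min(qmax, round(qmin - lo / scale)))

    # Round to the nearest level and clamp to the integer range.
    quantized = [
        max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values
    ]
    # Map integers back to floats; this is the value the network "sees"
    # during quantization-aware training.
    dequantized = [(q - zero_point) * scale for q in quantized]
    return quantized, dequantized, scale, zero_point


if __name__ == "__main__":
    weights = [-0.5, 0.0, 0.25, 1.0]  # illustrative values, not from the paper
    q, dq, scale, zp = fake_quantize(weights)
    for w, qi, d in zip(weights, q, dq):
        print(f"{w:+.4f} -> {qi:4d} -> {d:+.6f}")
```

Each weight is stored in one byte instead of four, which is where the model-size reduction comes from, and the rounding error per value stays within about half of `scale`. Exposing the network to this rounding during training is what lets QAT keep the accuracy drop small.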