AS SD, Sep 8, 2020

AutoKWS: Keyword Spotting with Differentiable Architecture Search

Bo Zhang, Wenfeng Li, Qingyuan Li, Weiji Zhuang, Xiangxiang Chu, Yujun Wang

arXiv:2009.03658v29.727 citations

Originality: Incremental advance

AI Analysis

This work addresses the need for efficient keyword spotting in always-on audio devices, representing an incremental improvement through automated architecture search.

The paper tackles the challenge of designing keyword spotting models that balance high accuracy and low latency for smart audio devices. Its searched model achieves 97.2% top-1 accuracy on the Google Speech Command Dataset v1 with about 100K parameters.

Smart audio devices are gated by an always-on lightweight keyword spotting program to reduce power consumption. It is, however, challenging to design models that have both high accuracy and low latency, so that the device responds accurately and quickly. Many efforts have been made to develop end-to-end neural networks, in which depthwise separable convolutions, temporal convolutions, and LSTMs are adopted as building units. Nonetheless, these networks designed with human expertise may not achieve an optimal trade-off in an expansive search space. In this paper, we propose to leverage recent advances in differentiable neural architecture search to discover more efficient networks. Our searched model attains 97.2% top-1 accuracy on the Google Speech Command Dataset v1 with only about 100K parameters.
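The abstract does not spell out the search procedure, so the following is only a minimal sketch of the general idea behind differentiable architecture search as popularized by DARTS: each edge of the network computes a softmax-weighted mixture of candidate operations, and the mixture weights are learned alongside the model weights. The candidate operations (`identity`, `avg_pool3`, `zero`), the function names, and the toy 1-D input are all hypothetical and chosen for illustration; they are not the paper's search space or implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate operations on a 1-D feature sequence.
def identity(x):
    return list(x)


def avg_pool3(x):
    # 3-tap moving average with edge replication, so the length is preserved.
    padded = [x[0]] + list(x) + [x[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3.0
            for i in range(len(x))]


def zero(x):
    return [0.0] * len(x)


CANDIDATE_OPS = [identity, avg_pool3, zero]


def mixed_op(x, alphas):
    """Continuous relaxation: softmax(alphas)-weighted sum of all candidates.

    Because the output is a smooth function of alphas, the architecture
    parameters can be optimized with gradient descent.
    """
    weights = softmax(alphas)
    outputs = [op(x) for op in CANDIDATE_OPS]
    return [sum(w * out[i] for w, out in zip(weights, outputs))
            for i in range(len(x))]


def derive_discrete_op(alphas):
    """After the search, keep only the candidate with the largest alpha."""
    best = max(range(len(alphas)), key=lambda i: alphas[i])
    return CANDIDATE_OPS[best]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]
    print(mixed_op(x, [0.0, 0.0, 0.0]))            # equal mixture of all ops
    print(derive_discrete_op([0.1, 2.0, -1.0]).__name__)
```

In a real search the `alphas` would be trainable tensors in an automatic-differentiation framework and the candidates would be layers such as the depthwise separable or temporal convolutions mentioned above; the plain-Python version here only shows how the relaxation and the final discretization fit together.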
