Behavior of Keyword Spotting Networks Under Noisy Conditions
This work addresses the problem of unreliable keyword spotting in noisy environments for smart device users; it represents an incremental improvement.
The paper investigates how state-of-the-art keyword spotting networks perform under high noise conditions, finding that their performance deteriorates, and suggests adaptive batch normalization as a technique to improve robustness when noise is unknown during training.
Keyword spotting (KWS) is becoming a ubiquitous need with advances in artificial intelligence and smart devices. Recent work in this field has focused on several different architectures that achieve good results on datasets with low to moderate noise. However, our experiments show that the performance of these models deteriorates under high-noise conditions. In this paper, we present an extensive comparison of state-of-the-art KWS networks under various noisy conditions. We also suggest adaptive batch normalization as a technique to improve the performance of the networks when the noise files are unknown during the training phase. This characterization of high-noise behavior enables future work on models that perform better in such conditions.
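To make the adaptive batch normalization idea concrete, here is a minimal sketch of the general technique: the normalization statistics learned on the training data are replaced with statistics estimated from unlabeled audio features drawn from the noisy target condition, while the learned scale and shift stay fixed. The abstract does not say how the authors apply it, so the class `BatchNorm1d`, the function `adapt_statistics`, and the toy feature values are hypothetical names and data chosen for illustration, not the paper's implementation.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class BatchNorm1d:
    """Inference-time batch normalization over one feature vector (hypothetical helper)."""
    mean: List[float]   # per-feature mean estimated on training data
    var: List[float]    # per-feature variance estimated on training data
    gamma: List[float]  # learned scale, left untouched by adaptation
    beta: List[float]   # learned shift, left untouched by adaptation
    eps: float = 1e-5

    def __call__(self, x: Sequence[float]) -> List[float]:
        # Normalize each feature with the stored statistics, then scale and shift.
        return [
            g * (xi - m) / (v + self.eps) ** 0.5 + b
            for xi, m, v, g, b in zip(x, self.mean, self.var, self.gamma, self.beta)
        ]


def adapt_statistics(bn: BatchNorm1d, target_batch: Sequence[Sequence[float]]) -> None:
    """Overwrite the stored mean and variance with those of unlabeled target-domain features.

    No labels and no gradient updates are needed; gamma and beta keep their trained values.
    """
    n = len(target_batch)
    dims = len(bn.mean)
    mean = [sum(row[j] for row in target_batch) / n for j in range(dims)]
    # Biased (population) variance, as used when normalizing a batch.
    var = [sum((row[j] - mean[j]) ** 2 for row in target_batch) / n for j in range(dims)]
    bn.mean, bn.var = mean, var


# Toy usage: statistics from clean training data, then adaptation to shifted "noisy" features.
bn = BatchNorm1d(mean=[0.0, 0.0], var=[1.0, 1.0], gamma=[1.0, 1.0], beta=[0.0, 0.0])
noisy_features = [[2.0, 4.0], [4.0, 8.0]]  # assumed activations under unseen noise
adapt_statistics(bn, noisy_features)
normalized = bn([3.0, 6.0])  # a sample at the new mean maps close to zero
```

In a full network, the same re-estimation would presumably be repeated for every batch normalization layer, using that layer's input activations on the noisy data. The appeal for the unknown-noise setting is that only unlabeled recordings from the deployment condition are needed.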