AS · LG · SD · Jun 9, 2024

Sparse Binarization for Fast Keyword Spotting

arXiv:2406.06634v1 · 1 citation · Has Code
Originality: Incremental advance
AI Analysis

This work addresses efficiency challenges for keyword spotting on edge devices, offering improved speed and accuracy for real-time voice-activated applications, though it is incremental in optimizing existing methods.

The paper tackles the problem of deploying keyword spotting models on edge devices with limited computational power. It proposes a sparse input representation followed by a linear classifier, which the authors report is four times faster and more accurate than the previous state-of-the-art edge-compatible model.

With the increasing prevalence of voice-activated devices and applications, keyword spotting (KWS) models enable users to interact with technology hands-free, enhancing convenience and accessibility in various contexts. Deploying KWS models on edge devices, such as smartphones and embedded systems, offers significant benefits for real-time applications, privacy, and bandwidth efficiency. However, these devices often possess limited computational power and memory. This necessitates optimizing neural network models for efficiency without significantly compromising their accuracy. To address these challenges, we propose a novel keyword-spotting model based on a sparse input representation followed by a linear classifier. The model is four times faster than the previous state-of-the-art edge-device-compatible model and achieves better accuracy. We also show that our method is more robust in noisy environments while remaining fast. Our code is available at: https://github.com/jsvir/sparknet.
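To make the idea of a sparse binary input feeding a linear classifier more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the top-k thresholding rule, the function names `sparse_binarize` and `linear_scores`, the 40 x 100 feature size, the 12 keyword classes, and the random weights are all assumptions made for illustration. It only shows why such a pipeline can be cheap at inference time: once the input is binary and sparse, each class score reduces to summing the weights at the active positions.

```python
import random


def sparse_binarize(values, keep_fraction=0.1):
    """Return the sorted indices of the largest `keep_fraction` of entries.

    Assumption: top-k thresholding stands in for whatever binarization
    rule the paper actually uses. The returned index list is the sparse
    form of a 0/1 vector (listed positions are 1, all others are 0).
    """
    k = max(1, round(keep_fraction * len(values)))
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return sorted(order[:k])


def linear_scores(active, weights, bias):
    """Score each class by summing its weights at the active positions.

    For a binary input x this equals W @ x + b, but it touches only the
    nonzero entries, so no multiplications are needed.
    """
    return [b + sum(row[i] for i in active) for row, b in zip(weights, bias)]


random.seed(0)

# Hypothetical input: 40 frequency bands x 100 frames, flattened.
num_features = 40 * 100
features = [random.gauss(0.0, 1.0) for _ in range(num_features)]

# Hypothetical, untrained classifier over 12 keyword classes.
num_keywords = 12
weights = [[random.gauss(0.0, 1.0) for _ in range(num_features)]
           for _ in range(num_keywords)]
bias = [0.0] * num_keywords

active = sparse_binarize(features, keep_fraction=0.1)
scores = linear_scores(active, weights, bias)
predicted = max(range(num_keywords), key=lambda c: scores[c])
```

With a 10% keep fraction, each class score here sums 400 weights instead of computing a 4,000-term dot product. Whether the speedup reported in the abstract comes from this effect is an assumption of the sketch, not something the abstract states.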

Code Implementations: 1 repo
