SDAS · Nov 1, 2018

Sequence-to-sequence Models for Small-Footprint Keyword Spotting

arXiv:1811.00348v1 · 5 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of efficient keyword spotting for real-world wake-up systems, presenting an incremental improvement over existing attention-based models.

The paper tackles keyword spotting by proposing a sequence-to-sequence model that simplifies production pipelines while meeting high accuracy, low-latency, and small-footprint requirements, achieving a false rejection rate of ~3.05% at 0.1 false alarms per hour with 73K parameters.

In this paper, we propose a sequence-to-sequence model for keyword spotting (KWS). Compared with other end-to-end architectures for KWS, our model simplifies the pipeline of a production-quality KWS system and satisfies the requirements of high accuracy, low latency, and small footprint. We also evaluate the performance of different encoder architectures, including LSTM and GRU. Experiments on real-world wake-up data show that our approach outperforms the recently proposed attention-based end-to-end model. Specifically, with 73K parameters, our sequence-to-sequence model achieves a ~3.05% false rejection rate (FRR) at 0.1 false alarms (FA) per hour.
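The headline metric, FRR at a fixed number of false alarms per hour, can be illustrated with a minimal sketch. This is not the paper's evaluation code. The function name `frr_at_fa_per_hour`, the one-score-per-segment input format, and the rule that a detection fires when a score is strictly above the threshold are all assumptions made for illustration. The idea is to pick the lowest threshold whose false alarm count on keyword-free audio stays within the budget, then measure the fraction of true keyword utterances that fall at or below that threshold.

```python
import math


def frr_at_fa_per_hour(positive_scores, negative_scores, negative_hours, target_fa_per_hour):
    """Return (threshold, frr) at the given false-alarm budget.

    Hypothetical helper, not from the paper. Assumes one detection score per
    audio segment and that a detection fires when score > threshold.
    """
    if not positive_scores:
        raise ValueError("need at least one positive score")

    # Total false alarms the budget allows over all of the negative audio.
    allowed = math.floor(target_fa_per_hour * negative_hours + 1e-9)

    # Negative scores from highest to lowest; the top `allowed` may fire.
    ranked = sorted(negative_scores, reverse=True)
    if allowed >= len(ranked):
        threshold = float("-inf")  # budget is never exceeded
    else:
        # Everything at or below this score is rejected, so at most
        # `allowed` negatives remain strictly above the threshold.
        threshold = ranked[allowed]

    # False rejections: true keywords that do not fire.
    rejected = sum(1 for s in positive_scores if s <= threshold)
    return threshold, rejected / len(positive_scores)


if __name__ == "__main__":
    # Toy scores, invented purely to exercise the function.
    thr, frr = frr_at_fa_per_hour(
        positive_scores=[0.9, 0.8, 0.4, 0.95],
        negative_scores=[0.1, 0.85, 0.3, 0.5, 0.2, 0.05],
        negative_hours=4.0,
        target_fa_per_hour=0.25,
    )
    print(f"threshold={thr}, FRR={frr:.2%}")
```

Under this reading, the reported ~3.05% FRR at 0.1 FA per hour means that, at the threshold admitting one false alarm per ten hours of negative audio, about 3 in 100 genuine wake-word utterances are missed.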

