LG CL AS ML

Oct 11, 2019

Query-by-example on-device keyword spotting

Byeonggeun Kim, Mingu Lee, Jinkyu Lee, Yeonseok Kim, Kyuwoong Hwang

arXiv:1910.05171v3

10.357 citations

Originality: Incremental advance

AI Analysis

This work addresses keyword spotting for on-device applications, but it is incremental as it builds on existing methods with user-specific adaptations.

The paper tackles the problem of user-specific keyword spotting on devices by introducing a query-by-example system that avoids out-of-vocabulary issues, showing promising performance for two English keywords.

A keyword spotting (KWS) system determines whether a keyword, usually predefined, is present in a continuous speech stream. This paper presents a user-specific, query-by-example, on-device KWS system. The proposed system consists of two main steps: query enrollment and testing. In the query enrollment step, a small-footprint automatic speech recognition model based on connectionist temporal classification (CTC) outputs phonetic posteriors. From the phonetic-level posteriorgram, a hypothesis graph is built as a finite-state transducer (FST), so the system can enroll any keyword and thereby avoids the out-of-vocabulary problem. In testing, the FST is used to score a log-likelihood for the input audio. We propose a threshold prediction method that uses only the user-specific keyword hypothesis. The system generates query-specific negatives by rearranging each query utterance in the waveform domain. The threshold is then decided from the enrollment queries and the generated negatives. We tested two English keywords, and the proposed approach shows promising performance while preserving simplicity.
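The threshold prediction idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `rearrange_waveform` and `predict_threshold`, the equal-length segment split, and the rule of interpolating between mean negative and mean positive scores are all assumptions made for illustration. The abstract only states that negatives are produced by rearranging each query waveform and that the threshold is derived from the enrollment queries and those negatives. The FST log-likelihood scorer is not shown, so scores are passed in as plain numbers.

```python
import random
from typing import List, Sequence


def rearrange_waveform(samples: Sequence[float], num_segments: int,
                       rng: random.Random) -> List[float]:
    """Build a query-specific negative by reordering chunks of a waveform.

    Assumption: the utterance is cut into `num_segments` contiguous chunks of
    near-equal length, and the chunks are permuted into a non-original order.
    """
    if num_segments < 2 or len(samples) < num_segments:
        raise ValueError("need at least 2 segments and 1 sample per segment")
    # Segment boundaries as integer sample indices.
    bounds = [len(samples) * i // num_segments for i in range(num_segments + 1)]
    segments = [list(samples[bounds[i]:bounds[i + 1]])
                for i in range(num_segments)]
    order = list(range(num_segments))
    # Reshuffle until the order differs from the original one.
    while order == sorted(order):
        rng.shuffle(order)
    return [sample for idx in order for sample in segments[idx]]


def predict_threshold(positive_scores: Sequence[float],
                      negative_scores: Sequence[float],
                      weight: float = 0.5) -> float:
    """Pick a decision threshold between negative and positive scores.

    Assumption: the threshold is a linear interpolation between the mean
    score of the generated negatives (weight=0) and the mean score of the
    enrollment queries (weight=1). The paper's actual rule may differ.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    pos_mean = sum(positive_scores) / len(positive_scores)
    neg_mean = sum(negative_scores) / len(negative_scores)
    return neg_mean + weight * (pos_mean - neg_mean)


if __name__ == "__main__":
    rng = random.Random(0)
    query = [float(i) for i in range(12)]           # stand-in for audio samples
    negative = rearrange_waveform(query, 3, rng)    # same samples, new order
    # Hypothetical log-likelihood scores from an FST scorer (not shown here).
    threshold = predict_threshold([-1.0, -3.0], [-9.0, -11.0])
    print(negative, threshold)
```

In this sketch a test utterance would be accepted when its log-likelihood exceeds the returned threshold. Raising `weight` moves the threshold toward the enrollment scores, trading false accepts for false rejects.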
