LG · Jun 3, 2023

Few-Shot Open-Set Learning for On-Device Customization of KeyWord Spotting Systems

arXiv:2306.02161v1 · 16 citations · h-index: 75
Originality: Incremental advance
AI Analysis

This work addresses the challenge of personalizing keyword spotting for users without requiring large datasets. It is an incremental advance that builds on existing few-shot and prototype-based methods.

The paper tackles the problem of enabling fast on-device customization for keyword spotting systems by using few-shot learning to classify user-defined keywords with limited data, achieving up to 76% accuracy in a 10-shot scenario while maintaining a 5% false acceptance rate for unknown data.

A personalized KeyWord Spotting (KWS) pipeline typically requires the training of a Deep Learning model on a large set of user-defined speech utterances, preventing fast customization directly applied on-device. To fill this gap, this paper investigates few-shot learning methods for open-set KWS classification by combining a deep feature encoder with a prototype-based classifier. With user-defined keywords from 10 classes of the Google Speech Command dataset, our study reports an accuracy of up to 76% in a 10-shot scenario while the false acceptance rate of unknown data is kept to 5%. In the analyzed settings, the usage of the triplet loss to train an encoder with normalized output features performs better than the prototypical networks jointly trained with a generator of dummy unknown-class prototypes. This design is also more effective than encoders trained on a classification problem and features fewer parameters than other iso-accuracy approaches.
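The pipeline described in the abstract (a feature encoder followed by a prototype-based classifier with rejection of unknown inputs) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the use of cosine similarity, the margin value, and the quantile-based threshold calibration are all assumptions made for illustration; the embeddings are assumed to come from an already trained encoder.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each vector (last axis) to unit L2 norm."""
    x = np.asarray(x, dtype=np.float64)
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)

def build_prototypes(embeddings, labels):
    """Average the normalized few-shot embeddings of each keyword class."""
    embeddings = l2_normalize(embeddings)
    labels = np.asarray(labels)
    classes = np.unique(labels)  # sorted unique class labels
    protos = np.stack([embeddings[labels == c].mean(axis=0) for c in classes])
    return classes, l2_normalize(protos)

def classify_open_set(query, classes, protos, threshold):
    """Return the nearest class, or None (unknown) if the best
    cosine similarity falls below the threshold."""
    sims = protos @ l2_normalize(query)
    best = int(np.argmax(sims))
    return classes[best] if sims[best] >= threshold else None

def threshold_for_far(unknown_best_sims, far=0.05):
    """Assumed calibration: pick the threshold so that roughly a fraction
    `far` of held-out unknown samples would still be accepted."""
    return float(np.quantile(unknown_best_sims, 1.0 - far))

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Hinge triplet loss on squared Euclidean distances between
    normalized embeddings (margin value is an illustrative assumption)."""
    a, p, n = (l2_normalize(v) for v in (anchor, positive, negative))
    d_pos = np.sum((a - p) ** 2, axis=-1)
    d_neg = np.sum((a - n) ** 2, axis=-1)
    return np.maximum(d_pos - d_neg + margin, 0.0)

# Toy 2-D "embeddings" standing in for encoder outputs (hypothetical data).
support = np.array([[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.9]])
labels = np.array(["yes", "yes", "no", "no"])
classes, protos = build_prototypes(support, labels)

print(classify_open_set(np.array([1.0, 0.05]), classes, protos, threshold=0.9))
print(classify_open_set(np.array([1.0, 1.0]), classes, protos, threshold=0.9))
```

On this toy data the first query lies close to the "yes" prototype and is accepted, while the second sits between both prototypes and is rejected as unknown. In practice the threshold would be tuned on held-out unknown data, for instance with something like `threshold_for_far`, to reach a target false acceptance rate such as the 5% reported above.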

Code Implementations: 1 repo