AS AI LG NE SD
Oct 8, 2021

Neural Model Reprogramming with Similarity Based Mapping for Low-Resource Spoken Command Recognition

arXiv:2110.03894v5
6 citations
Originality: Incremental advance
AI Analysis

The paper addresses spoken command recognition for speakers of low-resource languages, but the advance is incremental because it builds on existing adversarial reprogramming techniques.

The paper tackles low-resource spoken command recognition by proposing an adversarial reprogramming approach with similarity-based label mapping and transfer learning, achieving state-of-the-art results on Arabic and Lithuanian datasets using limited training data.

In this study, we propose a novel adversarial reprogramming (AR) approach for low-resource spoken command recognition (SCR), and build an AR-SCR system. The AR procedure modifies the acoustic signals (from the target domain) to repurpose a pretrained SCR model (from the source domain). To resolve the label mismatches between source and target domains, and to further improve the stability of AR, we propose a novel similarity-based label mapping technique to align classes. In addition, transfer learning (TL) is combined with the original AR process to improve the model's adaptation capability. We evaluate the proposed AR-SCR system on three low-resource SCR datasets: Arabic, Lithuanian, and dysarthric Mandarin speech. Experimental results show that, with a pretrained acoustic model (AM) trained on a large-scale English dataset, the proposed AR-SCR system outperforms the current state-of-the-art results on the Arabic and Lithuanian speech commands datasets while using only a limited amount of training data.
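The two ideas in the abstract, perturbing the input signal and mapping source classes onto target classes by similarity, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the use of cosine similarity over per-class embedding vectors, the top-k many-to-one mapping, and the averaging of source probabilities are all assumptions made for illustration. The training of the perturbation `delta` is omitted.

```python
import numpy as np

def cosine_similarity_matrix(a, b):
    # Row-normalise both sets of class embeddings, then take dot products.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def similarity_label_map(target_emb, source_emb, k=2):
    # Hypothetical mapping rule: for each target class, pick the k source
    # classes whose embeddings are most similar to it.
    sim = cosine_similarity_matrix(target_emb, source_emb)
    return np.argsort(-sim, axis=1)[:, :k]

def reprogram(x, delta):
    # Reprogramming step: add a trainable perturbation to the target-domain
    # signal before feeding it to the frozen source model (assumed additive).
    return x + delta

def target_probs(source_probs, mapping):
    # source_probs: (batch, n_source); mapping: (n_target, k).
    # Average the probabilities of the mapped source classes, then renormalise.
    agg = source_probs[:, mapping].mean(axis=2)
    return agg / agg.sum(axis=1, keepdims=True)

# Toy usage with 4 source classes and 2 target classes.
source_emb = np.eye(4)
target_emb = np.array([[1.0, 0.5, 0.0, 0.0],
                       [0.0, 0.0, 0.5, 1.0]])
mapping = similarity_label_map(target_emb, source_emb, k=2)
probs = target_probs(np.array([[0.4, 0.3, 0.2, 0.1]]), mapping)
```

In this toy case the first target class is mapped to source classes 0 and 1 and the second to source classes 3 and 2, so the frozen model's source-class output is read as a two-class target prediction without changing its weights.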

Code Implementations: 1 repo
