CL SD AS
May 2, 2023

The Pipeline System of ASR and NLU with MLM-based Data Augmentation toward STOP Low-resource Challenge

Hayato Futami, Jessica Huynh, Siddhant Arora, Shih-Lun Wu, Yosuke Kashiwagi, Yifan Peng, Brian Yan, Emiru Tsunoo, Shinji Watanabe

arXiv:2305.01194v2
0.96 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses low-resource domain adaptation for spoken language understanding, but it is incremental as it builds on existing models and techniques.

The paper tackles low-resource spoken language understanding by fine-tuning Whisper and BART with MLM-based data augmentation and retrieval methods, achieving an average exact match accuracy of 69.15% across the reminder and weather domains and winning first place in the challenge.
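
As a rough illustration of how the headline number is formed, the sketch below scores predictions by exact string match and averages the two per-domain scores reported in the abstract (63.3 and 75.0). The function names, the whitespace normalization, and the bracketed parse format are assumptions made for illustration; the challenge's official scorer may differ.

```python
def exact_match_accuracy(predictions, references):
    """Percentage of predictions identical to their reference.

    Whitespace is collapsed before comparison; this normalization is an
    assumption, not something stated in the source.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0

    def normalize(text):
        return " ".join(text.split())

    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return 100.0 * hits / len(references)


def macro_average(scores_by_domain):
    """Unweighted mean of per-domain scores."""
    return sum(scores_by_domain.values()) / len(scores_by_domain)


# Per-domain EM scores as reported in the abstract.
domain_em = {"reminder": 63.3, "weather": 75.0}
average_em = macro_average(domain_em)
```

The average is unweighted: each domain counts equally regardless of how many test utterances it contains.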

This paper describes our system for the low-resource domain adaptation track (Track 3) of the Spoken Language Understanding Grand Challenge, which is part of the ICASSP Signal Processing Grand Challenge 2023. In this track, we adopt a pipeline approach of ASR and NLU. For ASR, we fine-tune Whisper for each domain with upsampling. For NLU, we fine-tune BART on all the Track 3 data and then on the low-resource domain data. We apply masked LM (MLM)-based data augmentation, where some of the input tokens and their corresponding target labels are replaced using MLM. We also apply a retrieval-based approach, where the model input is augmented with similar training samples. As a result, we achieved an exact match (EM) accuracy of 63.3/75.0 (average: 69.15) for the reminder/weather domains and won 1st place in the challenge.
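
The retrieval-based step can be pictured with a minimal sketch: for each input utterance, find the most similar training utterances and append them, with their parses, to the model input. The abstract does not say how similarity is computed or how the retrieved samples are formatted, so the token-overlap (Jaccard) similarity, the `" | "` separator, the `=>` formatting, and all function names below are hypothetical choices rather than the authors' method.

```python
def jaccard(a, b):
    """Token-level Jaccard similarity between two utterances (assumed measure)."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def retrieve(query, train_pairs, k=2):
    """Return the k (utterance, parse) training pairs most similar to the query."""
    ranked = sorted(train_pairs, key=lambda pair: jaccard(query, pair[0]), reverse=True)
    return ranked[:k]


def augment_input(query, train_pairs, k=2, sep=" | "):
    """Append retrieved training samples to the query (hypothetical format)."""
    neighbors = retrieve(query, train_pairs, k)
    context = sep.join(f"{text} => {parse}" for text, parse in neighbors)
    return f"{query}{sep}{context}" if context else query
```

For example, with a toy training set of `("what is the weather in boston", "P2")` and `("remind me to call mom", "P1")`, calling `augment_input("what is the weather today", train, k=1)` appends the weather sample, since it shares the most tokens with the query.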
