The NUS-HLT System for ICASSP2024 ICMC-ASR Grand Challenge
This work addresses speech recognition in noisy car environments, but it is incremental as it builds on existing challenge frameworks.
The paper tackles in-car multi-channel automatic speech recognition by developing systems with front-end enhancement, diarization, data augmentation, and multi-channel modeling, achieving relative improvements of 34.3% in CER and 56.5% in cpCER over the baseline.
This paper summarizes our team's efforts in both tracks of the ICMC-ASR Challenge for in-car multi-channel automatic speech recognition. Our submitted systems include multi-channel front-end enhancement and diarization, training data augmentation, and speech recognition modeling with multi-channel branches. Tested on the official Eval1 and Eval2 sets, our best system achieves relative improvements of 34.3% in CER and 56.5% in cpCER over the official baseline system.
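For readers unfamiliar with the metric, the sketch below shows how a character error rate (CER) and a relative improvement over a baseline are commonly computed. It is an illustration only, not the challenge's scoring tool: the function names are hypothetical, the example strings and error rates are made-up values rather than results from the paper, and cpCER (which additionally handles speaker assignment) is not implemented here.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_improvement(baseline_err: float, system_err: float) -> float:
    """Fraction of the baseline error removed by the system."""
    if baseline_err <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline_err - system_err) / baseline_err


if __name__ == "__main__":
    # Toy strings: one substitution in a four-character reference.
    print(f"{cer('abcd', 'abxd'):.1%}")  # prints 25.0%
    # Made-up error rates: halving the baseline error is a 50% relative gain.
    print(f"{relative_improvement(0.5, 0.25):.1%}")  # prints 50.0%
```

Under this definition, a relative improvement of 34.3% means the system's error rate is 65.7% of the baseline's error rate; it does not mean an absolute drop of 34.3 percentage points.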