SD AS Nov 13, 2020

The SLT 2021 children speech recognition challenge: Open datasets, rules and baselines

Fan Yu, Zhuoyuan Yao, Xiong Wang, Keyu An, Lei Xie, Zhijian Ou, Bo Liu, Xiulin Li, Guanqiong Miao

arXiv:2011.06724v2

AI Analysis

This work addresses the limited data and the performance gap in children's speech recognition. Its contribution is incremental: it focuses on dataset release and benchmarking rather than on novel methods.

The paper introduces the Children Speech Recognition Challenge (CSRC) to address the lack of open datasets and poor performance in children's speech recognition, releasing about 400 hours of Mandarin speech data and providing baselines for benchmarking.

Automatic speech recognition (ASR) has advanced significantly with the use of deep learning and big data. However, improving robustness, including achieving equally good performance across diverse speakers and accents, remains a challenging problem. In particular, the performance of children speech recognition (CSR) still lags behind because 1) the speech and language characteristics of children's voices differ substantially from those of adults and 2) a sizable open dataset of children's speech is still not available to the research community. To address these problems, we launch the Children Speech Recognition Challenge (CSRC) as a flagship satellite event of the IEEE SLT 2021 workshop. The challenge will release about 400 hours of Mandarin speech data to registered teams, set up two challenge tracks, and provide a common testbed for benchmarking CSR performance. In this paper, we introduce the datasets, rules, and evaluation method, as well as the baselines.
