CL SD AS Oct 12, 2022

Summary on the ISCSLP 2022 Chinese-English Code-Switching ASR Challenge

Shuhao Deng, Chengfei Li, Jinfeng Bai, Qingqing Zhang, Wei-Qiang Zhang, Runyan Yang, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan

arXiv:2210.06091v20.61 citations h-index: 28

Originality Synthesis-oriented

AI Analysis

This paper addresses the challenge of code-switching speech recognition for multilingual applications, but its contribution is incremental: it summarizes a competition rather than introducing new methods.

The paper describes the ISCSLP 2022 Chinese-English Code-Switching ASR Challenge, which tackled automatic speech recognition in multilingual code-switching scenarios. The winning team achieved a 16.70% Mixture Error Rate, a 9.8% absolute improvement over the baseline.

Code-switching automatic speech recognition has become one of the most challenging and most valuable scenarios in automatic speech recognition, both because speakers switch between languages and because code-switching occurs frequently in daily life. The ISCSLP 2022 Chinese-English Code-Switching Automatic Speech Recognition (CSASR) Challenge aims to promote the development of code-switching automatic speech recognition. The challenge provided participants with two training sets, the TAL_CSASR corpus and the MagicData-RAMC corpus, together with a development set and a test set, for CSASR model training and evaluation. Along with the challenge, we also provided baseline system performance for reference. More than 40 teams participated, and the winning team achieved a 16.70% Mixture Error Rate (MER) on the test set, a 9.8% absolute MER improvement over the baseline system. In this paper, we describe the datasets, the associated baseline system, and the requirements, and we summarize the challenge results and the major techniques and tricks used in the submitted systems.
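The abstract reports results in Mixture Error Rate (MER) without defining it. The sketch below assumes the convention commonly used for Mandarin-English code-switching: each Chinese character and each English word counts as one token, and MER is the token-level edit distance divided by the number of reference tokens. The function names (`tokenize`, `edit_distance`, `mixture_error_rate`) and the tokenization regex are illustrative assumptions, not the challenge's official scoring script, which may normalize text differently.

```python
import re

# Assumption: one token per CJK ideograph, or per run of Latin letters/apostrophes.
# Digits and punctuation are ignored in this sketch.
_TOKEN = re.compile(r"[\u4e00-\u9fff]|[A-Za-z']+")


def tokenize(text):
    """Split mixed text into Chinese characters and lowercased English words."""
    return _TOKEN.findall(text.lower())


def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists (unit cost for sub/ins/del)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def mixture_error_rate(references, hypotheses):
    """Corpus-level MER: total token errors / total reference tokens."""
    errors = 0
    total = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tokens = tokenize(ref)
        errors += edit_distance(ref_tokens, tokenize(hyp))
        total += len(ref_tokens)
    if total == 0:
        raise ValueError("references contain no tokens")
    return errors / total


# One English word substituted out of six reference tokens.
print(mixture_error_rate(["我想买 iPhone 手机"], ["我想买 phone 手机"]))
```

Under this convention, the reported figures are absolute differences in the same ratio: a 16.70% MER with a 9.8% absolute improvement implies a baseline MER of roughly 26.5% (16.70 + 9.8).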
