CL AI LG AS SP · Jun 11, 2024

Reading Miscue Detection in Primary School through Automatic Speech Recognition

Lingyun Gao, Cristian Tejedor-Garcia, Helmer Strik, Catia Cucchiarini

arXiv:2406.07060v1 · 1.94 citations · h-index: 38

Originality: Incremental advance

AI Analysis

This research addresses the limited availability of ASR-based reading diagnosis systems for non-English child speech, offering incremental improvements in efficiency for teachers and students in primary education.

This study evaluates how well pretrained automatic speech recognition (ASR) models recognize Dutch primary school children's speech and detect reading miscues. Hubert Large achieved state-of-the-art phoneme-level recognition with a phoneme error rate (PER) of 23.1%, and Whisper achieved state-of-the-art word-level performance with a word error rate (WER) of 9.8%. For miscue detection, Wav2Vec2 Large showed the highest recall (0.83), while Whisper had the highest precision (0.52) and an F1 score of 0.52.

Automatic reading diagnosis systems can benefit both teachers, through more efficient scoring of reading exercises, and students, through easier access to reading exercises with feedback. However, there are limited studies on Automatic Speech Recognition (ASR) for child speech in languages other than English, and limited research on ASR-based reading diagnosis systems. This study investigates how efficiently state-of-the-art (SOTA) pretrained ASR models recognize native Dutch children's speech and detect reading miscues. We found that Hubert Large finetuned on Dutch speech achieves SOTA phoneme-level child speech recognition (PER of 23.1%), while Whisper (Faster Whisper Large-v2) achieves SOTA word-level performance (WER of 9.8%). Our findings suggest that Wav2Vec2 Large and Whisper are the two best ASR models for reading miscue detection. Specifically, Wav2Vec2 Large shows the highest recall at 0.83, whereas Whisper exhibits the highest precision at 0.52 and an F1 score of 0.52.
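
The metrics quoted in the abstract (PER, WER, precision, recall, F1) can be illustrated with a minimal sketch, assuming the standard definitions: an error rate is the Levenshtein distance between reference and hypothesis token sequences divided by the reference length, and the detection scores are computed from true-positive, false-positive, and false-negative counts. The function names and the toy sentence are illustrative assumptions, not the paper's scoring pipeline or data.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (words or phonemes)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref to each hyp prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rate(ref, hyp):
    """WER if tokens are words, PER if tokens are phonemes."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def precision_recall_f1(tp, fp, fn):
    """Detection scores from raw counts; returns 0.0 where undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example (made up for illustration): one substitution and one deletion.
reference = "de kat zit op de mat".split()
hypothesis = "de kat zat op mat".split()
wer = error_rate(reference, hypothesis)
```

A high-recall model flags most real miscues at the cost of more false alarms, while a high-precision model raises fewer false alarms but misses more miscues, which is the trade-off the abstract reports between Wav2Vec2 Large and Whisper.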
