AS CL SD | Jan 11, 2025

Speech Recognition for Automatically Assessing Afrikaans and isiXhosa Preschool Oral Narratives

Christiaan Jacobs, Annelien Smith, Daleen Klop, Ondřej Klejch, Febe de Wet, Herman Kamper

arXiv:2501.06478v1 | 3 citations | h-index: 18 | ICASSP

Originality: Incremental advance

AI Analysis

This work addresses the need for language development assessment tools for preschool children who speak under-resourced languages. It is an incremental validation of existing child-speech ASR strategies in a novel setting.

The researchers built ASR systems for oral narratives told by Afrikaans and isiXhosa preschool children, with the aim of automatically assessing language development. Additional in-domain adult data gave the biggest improvement, especially when combined with voice conversion. Semi-supervised learning helped for both languages, while parameter-efficient fine-tuning helped only for Afrikaans.

We develop automatic speech recognition (ASR) systems for stories told by Afrikaans and isiXhosa preschool children. Oral narratives provide a way to assess children's language development before they learn to read. We consider a range of prior child-speech ASR strategies to determine which is best suited to this unique setting. Using Whisper and only 5 minutes of transcribed in-domain child speech, we find that additional in-domain adult data (adult speech matching the story domain) provides the biggest improvement, especially when coupled with voice conversion. Semi-supervised learning also helps for both languages, while parameter-efficient fine-tuning helps on Afrikaans but not on isiXhosa (which is under-represented in the Whisper model). Few child-speech studies look at non-English data, and even fewer at the preschool ages of 4 and 5. Our work therefore represents a unique validation of a wide range of previous child-speech ASR strategies in an under-explored setting.
