CL · SD · AS · Apr 3, 2021

speechocean762: An Open-Source Non-native English Speech Corpus For Pronunciation Assessment

arXiv:2104.01378v2 · 127 citations · Has Code
AI Analysis

This paper addresses the need for accessible pronunciation assessment data for non-native English speakers, particularly children, but its contribution is incremental: it primarily offers a new dataset.

The paper introduces speechocean762, an open-source corpus of 5000 English utterances from 250 non-native speakers, including children, with expert annotations at multiple levels, and provides a baseline system for pronunciation assessment.

This paper introduces a new open-source speech corpus named "speechocean762" designed for pronunciation assessment use, consisting of 5000 English utterances from 250 non-native speakers, where half of the speakers are children. Five experts annotated each of the utterances at sentence-level, word-level and phoneme-level. A baseline system is released in open source to illustrate the phoneme-level pronunciation assessment workflow on this corpus. This corpus is allowed to be used freely for commercial and non-commercial purposes. It is available for free download from OpenSLR, and the corresponding baseline system is published in the Kaldi speech recognition toolkit.
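
As a rough illustration of what "annotated at sentence-level, word-level and phoneme-level by five experts" could look like in code, here is a minimal Python sketch. All class names, field names, and example scores are hypothetical and do not reflect the corpus's actual file format or scoring scales. Collapsing the five expert scores with a median is also an assumption made for the example, not something the abstract states.

```python
from dataclasses import dataclass, field
from statistics import median
from typing import List, Tuple


@dataclass
class PhoneAnnotation:
    phone: str
    expert_scores: List[float]  # one score per expert (hypothetical scale)


@dataclass
class WordAnnotation:
    text: str
    expert_scores: List[float]
    phones: List[PhoneAnnotation] = field(default_factory=list)


@dataclass
class UtteranceAnnotation:
    utt_id: str       # hypothetical identifier
    speaker_id: str   # hypothetical identifier
    expert_scores: List[float]
    words: List[WordAnnotation] = field(default_factory=list)


def consensus(scores: List[float]) -> float:
    """Collapse per-expert scores into one value (median is an assumed choice)."""
    return median(scores)


def phone_level_targets(utt: UtteranceAnnotation) -> List[Tuple[str, float]]:
    """Flatten an utterance into (phone, consensus score) pairs."""
    return [
        (p.phone, consensus(p.expert_scores))
        for w in utt.words
        for p in w.phones
    ]


# Made-up example: one utterance, one word, two phones, five experts each.
example = UtteranceAnnotation(
    utt_id="utt0001",
    speaker_id="spk001",
    expert_scores=[8, 7, 8, 9, 8],
    words=[
        WordAnnotation(
            text="we",
            expert_scores=[9, 8, 9, 9, 10],
            phones=[
                PhoneAnnotation("W", [2, 2, 1, 2, 2]),
                PhoneAnnotation("IY", [2, 1, 1, 2, 0]),
            ],
        )
    ],
)
```

A phoneme-level assessment model would train against something like the output of `phone_level_targets(example)`, while the sentence-level and word-level scores remain available on the same nested object.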

Code Implementations (2 repos)
