CL · SD · AS · Mar 28, 2024

Phonetic Segmentation of the UCLA Phonetics Lab Archive

arXiv:2403.19509v186 citations · h-index: 3 · LREC
Originality: Synthesis-oriented
AI Analysis

This work gives researchers in phonetics, linguistics, and speech technology a more accessible and more finely annotated dataset. The contribution is incremental, since it builds on existing data rather than new recordings.

The researchers addressed the limited accessibility and coarse granularity of the UCLA Phonetics Lab Archive by creating VoxAngeles, a corpus of audited phonetic transcriptions and phone-level alignments for 95 languages. The result is easier to use for quantitative phonetic typology and other applications.

Research in speech technologies and comparative linguistics depends on access to diverse and accessible speech data. The UCLA Phonetics Lab Archive is one of the earliest multilingual speech corpora, with long-form audio recordings and phonetic transcriptions for 314 languages (Ladefoged et al., 2009). Recently, 95 of these languages were time-aligned with word-level phonetic transcriptions (Li et al., 2021). Here we present VoxAngeles, a corpus of audited phonetic transcriptions and phone-level alignments of the UCLA Phonetics Lab Archive, which uses the 95-language CMU re-release as our starting point. VoxAngeles also includes word- and phone-level segmentations from the original UCLA corpus, as well as phonetic measurements of word and phone durations, vowel formants, and vowel f0. This corpus enhances the usability of the original data, particularly for quantitative phonetic typology, as demonstrated through a case study of vowel intrinsic f0. We also discuss the utility of the VoxAngeles corpus for general research and pedagogy in crosslinguistic phonetics, as well as for low-resource and multilingual speech technologies. VoxAngeles is free to download and use under a CC-BY-NC 4.0 license.
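As a rough illustration of the kind of measurement the abstract mentions, the sketch below derives phone durations from aligned intervals and compares mean f0 for high and low vowels, which is the usual framing of intrinsic f0 (high vowels tend to have higher f0 than low vowels). Everything in it is an assumption made for illustration: the `Phone` record, its field names, the vowel sets, and the toy values are hypothetical and do not reflect the actual VoxAngeles file layout or results.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Iterable, Optional, Tuple


@dataclass
class Phone:
    """One aligned phone interval (hypothetical record, not the corpus format)."""
    label: str                     # IPA symbol
    start: float                   # interval start, in seconds
    end: float                     # interval end, in seconds
    f0_hz: Optional[float] = None  # mean f0 over the interval, if measured

    @property
    def duration(self) -> float:
        # Phone duration follows directly from the alignment boundaries.
        return self.end - self.start


# Illustrative vowel height classes; a real study would use a fuller inventory.
HIGH_VOWELS = {"i", "u"}
LOW_VOWELS = {"a"}


def mean_f0_by_height(
    phones: Iterable[Phone],
) -> Tuple[Optional[float], Optional[float]]:
    """Return (mean f0 of high vowels, mean f0 of low vowels), or None if empty."""
    phones = list(phones)
    high = [p.f0_hz for p in phones if p.label in HIGH_VOWELS and p.f0_hz is not None]
    low = [p.f0_hz for p in phones if p.label in LOW_VOWELS and p.f0_hz is not None]
    return (mean(high) if high else None, mean(low) if low else None)


# Invented toy data, for demonstration only.
phones = [
    Phone("p", 0.00, 0.08),
    Phone("i", 0.08, 0.20, 212.0),
    Phone("t", 0.20, 0.27),
    Phone("a", 0.27, 0.42, 198.0),
    Phone("k", 0.42, 0.50),
    Phone("u", 0.50, 0.63, 208.0),
]

high_f0, low_f0 = mean_f0_by_height(phones)
if high_f0 is not None and low_f0 is not None:
    print(f"high - low vowel f0 difference: {high_f0 - low_f0:.1f} Hz")
```

A real analysis would likely normalize f0 per speaker and control for tone and segmental context before comparing vowel classes; the sketch skips those steps to keep the core comparison visible.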

Code Implementations: 1 repo
