AS CL LG SD Jun 2, 2025

Confidence intervals for forced alignment boundaries using model ensembles

arXiv:2506.01256v11.2 Has Code

Originality: Incremental advance

AI Analysis

The method gives researchers uncertainty estimates for alignment boundaries, which supports better diagnostics and analysis. The advance is incremental because it builds on existing ensemble techniques.

The paper addresses a limitation of forced alignment tools, which typically report only a single estimate for each boundary. It introduces a method for deriving confidence intervals from an ensemble of neural networks. On the Buckeye and TIMIT corpora, the ensemble boundaries show a slight improvement over those of a single model.

Forced alignment is a common tool to align audio with orthographic and phonetic transcriptions. Most forced alignment tools provide only a single estimate of a boundary. The present project introduces a method of deriving confidence intervals for these boundaries using a neural network ensemble technique. Ten different segment classifier neural networks were previously trained, and the alignment process is repeated with each model. The alignment ensemble is then used to place the boundary at the median of the boundaries in the ensemble, and 97.85% confidence intervals are constructed using order statistics. On the Buckeye and TIMIT corpora, the ensemble boundaries show a slight improvement over using just a single model. The confidence intervals are incorporated into Praat TextGrids using a point tier, and they are also output as a table for researchers to analyze separately as diagnostics or to incorporate uncertainty into their analyses.
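The 97.85% figure in the abstract can be reproduced with a short calculation. The sketch below assumes the interval runs from the 2nd to the 9th of the ten ordered boundary estimates. The abstract does not name the order statistics used, but that choice yields the stated coverage for a distribution-free interval on the median of a continuous distribution. The function names and the boundary times are hypothetical and do not come from the paper's code.

```python
from math import comb
from statistics import median


def median_ci_coverage(n: int, k: int) -> float:
    """Coverage of [x_(k), x_(n-k+1)] as a distribution-free CI for the median."""
    # The number of estimates below the true median is Binomial(n, 0.5).
    # The interval misses the median when fewer than k estimates fall on one side.
    tail = sum(comb(n, i) for i in range(k)) / 2**n
    return 1.0 - 2.0 * tail


def ensemble_boundary(times: list[float], k: int = 2) -> tuple[float, float, float]:
    """Return (median, lower, upper) for one boundary across the ensemble.

    k = 2 is an assumption: it selects the 2nd smallest and 2nd largest estimates.
    """
    ordered = sorted(times)
    n = len(ordered)
    return median(ordered), ordered[k - 1], ordered[n - k]


# Hypothetical boundary times (seconds) for one boundary from ten models.
times = [0.512, 0.498, 0.505, 0.520, 0.501, 0.509, 0.495, 0.515, 0.503, 0.507]
mid, lower, upper = ensemble_boundary(times)
print(f"coverage = {median_ci_coverage(len(times), 2):.4%}")
print(f"boundary = {mid:.3f} s, interval = [{lower:.3f}, {upper:.3f}] s")
```

With ten models and k = 2, the coverage is 1 - 22/1024 = 0.9785, which matches the 97.85% level reported above.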

