CL | CY | SD | AS | Jun 19, 2025

Automatic Speech Recognition Biases in Newcastle English: an Error Analysis

arXiv:2506.16558v1 | 5 citations | h-index: 1 | INTERSPEECH
Originality: Synthesis-oriented
AI Analysis

This paper addresses regional bias in ASR for speakers of Newcastle English. The contribution is incremental, since it builds on prior research into ASR bias.

The study investigated ASR performance on Newcastle English, finding that errors directly correlate with regional dialectal features, with social factors playing a lesser role, and identified key phonological, lexical, and morphosyntactic errors behind misrecognitions.

Automatic Speech Recognition (ASR) systems struggle with regional dialects due to biased training which favours mainstream varieties. While previous research has identified racial, age, and gender biases in ASR, regional bias remains underexamined. This study investigates ASR performance on Newcastle English, a well-documented regional dialect known to be challenging for ASR. A two-stage analysis was conducted: first, a manual error analysis on a subsample identified key phonological, lexical, and morphosyntactic errors behind ASR misrecognitions; second, a case study focused on the systematic analysis of ASR recognition of the regional pronouns "yous" and "wor". Results show that ASR errors directly correlate with regional dialectal features, while social factors play a lesser role in ASR mismatches. We advocate for greater dialectal diversity in ASR training data and highlight the value of sociolinguistic analysis in diagnosing and addressing regional biases.
