Extracting accent features in spoken Brazilian Portuguese without sociolinguistic labels
This work gives researchers and application developers a more reliable method for regional accent classification in Brazilian Portuguese, one that does not depend on unreliable sociolinguistic labels.
This paper addresses regional accent classification in Brazilian Portuguese by proposing a novel workflow that extracts accent features using only acoustic labels, avoiding the need for unreliable sociolinguistic labels. The method isolates explicit regional accent landmarks and employs a phoneme-based forced aligner, yielding a targeted feature set that captures dialectal variance more effectively than utterance embeddings.
Regional accent classification in Brazilian Portuguese (pt-BR) is held back by its dependence on reliable labels. While large self-supervised learning (SSL) speech models are powerful, their training pipelines dilute sociophonetic information, since accent labels are generally unreliable or are not used in training objectives. This work introduces a novel workflow for feature extraction that uses only acoustic labels. By isolating explicit regional accent landmarks with a phoneme-based forced aligner (ZIPA), our targeted feature set captures dialectal variance more effectively than utterance embeddings, demonstrating that localized features can outperform general-purpose architectures on accent-related tasks while relying on minimal, objective data labels.
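
As a rough illustration of what landmark-localized feature extraction might look like, the sketch below pools frame-level acoustic features only over the time spans that a forced aligner assigned to a chosen set of phonemes. It is not the authors' implementation: the alignment format (phoneme, start, end), the function name `landmark_features`, the landmark set, and the mean-pooling step are all assumptions made for illustration, and the alignment is assumed to have been produced beforehand by an aligner such as ZIPA.

```python
from collections import defaultdict
from statistics import fmean

# Hypothetical landmark set: phonemes assumed to carry regional variation.
LANDMARK_PHONEMES = {"r", "s", "t", "d"}


def landmark_features(alignment, frames, frame_rate=100.0,
                      landmarks=LANDMARK_PHONEMES):
    """Mean-pool frame features over landmark phoneme segments.

    alignment: list of (phoneme, start_sec, end_sec) tuples (assumed format).
    frames:    list of per-frame feature vectors (lists of floats).
    Returns a dict mapping each landmark phoneme found to one pooled vector.
    """
    per_phoneme = defaultdict(list)
    for phoneme, start, end in alignment:
        if phoneme not in landmarks:
            continue  # ignore segments that are not accent landmarks
        lo = round(start * frame_rate)
        hi = min(len(frames), max(lo + 1, round(end * frame_rate)))
        segment = frames[lo:hi]
        if not segment:
            continue  # segment lies outside the available frames
        dims = len(segment[0])
        # Pool within one occurrence of the phoneme.
        per_phoneme[phoneme].append(
            [fmean([frame[d] for frame in segment]) for d in range(dims)]
        )
    # Average across all occurrences of the same phoneme in the utterance.
    return {
        phoneme: [fmean([vec[d] for vec in vecs]) for d in range(len(vecs[0]))]
        for phoneme, vecs in per_phoneme.items()
    }


# Toy usage: four frames at 10 frames per second, one landmark segment.
toy_frames = [[0.0, 0.0], [1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
toy_alignment = [("a", 0.0, 0.1), ("r", 0.1, 0.3), ("o", 0.3, 0.4)]
toy_features = landmark_features(toy_alignment, toy_frames,
                                 frame_rate=10.0, landmarks={"r"})
```

The resulting per-phoneme vectors could then be concatenated into a fixed-length input for a classifier, in contrast to a single embedding pooled over the whole utterance.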