CL AS | Jun 3, 2025

A Multi-Dialectal Dataset for German Dialect ASR and Dialect-to-Standard Speech Translation

Verena Blaschke, Miriam Winkler, Constantin Förster, Gabriele Wenger-Glemser, Barbara Plank

arXiv:2506.02894v1 | 10.98 citations | h-index: 7 | INTERSPEECH

Originality: Synthesis-oriented

AI Analysis

The paper provides an evaluation dataset for studying how robust ASR models are to dialectal variation in German. The contribution is incremental in that it benchmarks existing multilingual ASR models rather than proposing a new method.

The authors address the underrepresentation of German dialects in ASR research by creating Betthupferl, an evaluation dataset with four hours of read speech in three dialect groups (Franconian, Bavarian, Alemannic) plus half an hour of Standard German speech, transcribed both dialectally and in Standard German. They find that state-of-the-art multilingual ASR models differ in how closely their output resembles the dialectal versus the standardized transcriptions.

Although Germany has a diverse landscape of dialects, they are underrepresented in current automatic speech recognition (ASR) research. To enable studies of how robust models are towards dialectal variation, we present Betthupferl, an evaluation dataset containing four hours of read speech in three dialect groups spoken in Southeast Germany (Franconian, Bavarian, Alemannic), and half an hour of Standard German speech. We provide both dialectal and Standard German transcriptions, and analyze the linguistic differences between them. We benchmark several multilingual state-of-the-art ASR models on speech translation into Standard German, and find differences between how much the output resembles the dialectal vs. standardized transcriptions. Qualitative error analyses of the best ASR model reveal that it sometimes normalizes grammatical differences, but often stays closer to the dialectal constructions.
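The comparison described in the abstract, scoring one model output against both a dialectal and a Standard German reference, can be illustrated with a small sketch. It assumes word error rate (WER) as the resemblance measure, which the abstract does not specify, and the three sentences are invented for illustration and are not taken from Betthupferl. The function name `word_error_rate` is likewise hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Invented example sentences (assumption: not from the dataset).
dialect_reference = "i hob koa zeit"        # dialectal transcription
standard_reference = "ich habe keine zeit"  # Standard German transcription
model_output = "ich hab keine zeit"         # hypothetical ASR output

wer_dialect = word_error_rate(dialect_reference, model_output)
wer_standard = word_error_rate(standard_reference, model_output)
```

A lower WER against the Standard German reference than against the dialectal one would indicate that the model output is closer to the standardized form, and vice versa.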
