CL · Dec 9, 2020

Normalization of Different Swedish Dialects Spoken in Finland

Mika Hämäläinen, Niko Partanen, Khalid Alnajjar

arXiv:2012.05318v1 · 9 citations

AI Analysis

This research provides important baselines and insights into the adaptability of dialect normalization methods for researchers working with Finland Swedish dialects.

This study developed a dialect normalization method for Finland Swedish dialects from six regions, reducing the word error rate from 76.45 to 28.58. The best performance was obtained by training the model with one word at a time, which contrasts with findings from previous research on Finnish dialects.
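For readers unfamiliar with the metric, the following is a minimal sketch of word error rate (WER) in its standard form: the word-level edit distance between a reference and a hypothesis, divided by the number of reference words. It assumes the reported figures are percentages and that whitespace tokenization is adequate; the function name `word_error_rate` and the example sentences are illustrative, not taken from the paper, and the authors' exact evaluation procedure may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage, using word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


# Illustrative example (invented sentences, not corpus data):
# one of five reference words is wrong in the hypothesis.
print(word_error_rate("jag har inte sett honom", "jag har int sett honom"))
```

Under this definition, a drop from 76.45 to 28.58 would mean that roughly three in four words differ from the standard form before normalization, and fewer than three in ten after it.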

Our study presents a dialect normalization method for different Finland Swedish dialects covering six regions. We tested five different models, and the best model improved the word error rate from 76.45 to 28.58. Contrary to results reported in earlier research on Finnish dialects, we found that training the model with one word at a time gave the best results. We believe this is due to the size of the training data available for the model. Our models are accessible as a Python package. The study provides important information about the adaptability of these methods in different contexts, and gives important baselines for further study.
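To make "one word at a time" concrete, here is a minimal sketch of how word-aligned dialect and normalized text might be cut into training pairs of a chosen chunk size. The character-separated encoding with `_` as a word boundary is an assumption commonly used for character-level sequence-to-sequence models, not something this abstract specifies; `make_training_pairs`, `encode_chunk`, and the sample words are hypothetical and do not reflect the API of the authors' package.

```python
def encode_chunk(words: list[str]) -> str:
    """Split each word into characters and join words with a '_' marker.

    Assumed encoding for illustration only.
    """
    return " _ ".join(" ".join(word) for word in words)


def make_training_pairs(
    dialect_words: list[str],
    normalized_words: list[str],
    chunk_size: int = 1,
) -> list[tuple[str, str]]:
    """Build (source, target) pairs from word-aligned sequences.

    chunk_size=1 corresponds to training with one word at a time;
    larger values put several consecutive words in each example.
    """
    if len(dialect_words) != len(normalized_words):
        raise ValueError("sequences must be aligned word by word")
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")

    pairs = []
    for start in range(0, len(dialect_words), chunk_size):
        source = dialect_words[start:start + chunk_size]
        target = normalized_words[start:start + chunk_size]
        pairs.append((encode_chunk(source), encode_chunk(target)))
    return pairs


# Invented example words, not corpus data.
dialect = ["int", "sett"]
standard = ["inte", "sett"]
print(make_training_pairs(dialect, standard, chunk_size=1))
print(make_training_pairs(dialect, standard, chunk_size=2))
```

With a chunk size of 1 each example carries no sentence context, but every word form is a separate training instance, which is one plausible reason smaller chunks could help when training data is limited.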
