CL SD AS Aug 20, 2025

EmoTale: An Enacted Speech-emotion Dataset in Danish

Maja J. Hjuler, Harald V. Skat-Rørdam, Line H. Clemmensen, Sneha Das

arXiv:2508.14548v1 4.92 citations h-index: 7

Originality: Synthesis-oriented

AI Analysis

EmoTale provides a functional dataset for speech emotion recognition in Danish, addressing a gap for smaller languages, but the contribution is incremental because it builds on existing methods and datasets.

The authors address the lack of emotional speech datasets for Danish by creating EmoTale, a new corpus of Danish and English recordings with enacted emotion annotations. They demonstrate its validity by training speech emotion recognition models whose best result is a 64.1% unweighted average recall, comparable to performance on the existing Danish Emotional Speech (DES) dataset.

While multiple emotional speech corpora exist for commonly spoken languages, there is a lack of functional datasets for smaller (spoken) languages, such as Danish. To our knowledge, Danish Emotional Speech (DES), published in 1997, is the only other database of Danish emotional speech. We present EmoTale, a corpus comprising Danish and English speech recordings with their associated enacted emotion annotations. We demonstrate the validity of the dataset by investigating and presenting its predictive power using speech emotion recognition (SER) models. We develop SER models for EmoTale and the reference datasets using self-supervised speech model (SSLM) embeddings and the openSMILE feature extractor. We find the embeddings superior to the hand-crafted features. The best model achieves an unweighted average recall (UAR) of 64.1% on the EmoTale corpus using leave-one-speaker-out cross-validation, comparable to the performance on DES.
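
The evaluation protocol named in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions: UAR as the mean of per-class recalls, and leave-one-speaker-out cross-validation as one fold per speaker with that speaker's recordings held out for testing. The function names, labels, and speaker IDs are hypothetical toy values, not taken from EmoTale, and the abstract does not say how the authors aggregate UAR across folds.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls: every class counts equally, whatever its size."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def leave_one_speaker_out(speakers):
    """Yield (held_out_speaker, train_indices, test_indices), one fold per speaker."""
    for held_out in sorted(set(speakers)):
        train = [i for i, s in enumerate(speakers) if s != held_out]
        test = [i for i, s in enumerate(speakers) if s == held_out]
        yield held_out, train, test


# Toy labels (hypothetical, for illustration only).
y_true = ["angry", "angry", "angry", "happy", "neutral", "neutral"]
y_pred = ["angry", "angry", "happy", "happy", "neutral", "angry"]

# Per-class recalls: angry 2/3, happy 1/1, neutral 1/2, so UAR = 13/18.
uar = unweighted_average_recall(y_true, y_pred)

# Toy speaker IDs: each fold holds out every recording of one speaker.
speakers = ["s1", "s1", "s2", "s2", "s3", "s3"]
folds = list(leave_one_speaker_out(speakers))
```

Because each class contributes equally to the mean, UAR is not inflated by a classifier that favors the most frequent emotion, and holding out whole speakers keeps the model from being tested on voices it saw during training.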
