SD | AI | AS | Jul 1, 2024

Are you sure? Analysing Uncertainty Quantification Approaches for Real-world Speech Emotion Recognition

arXiv:2407.01143v1 | 7 citations | h-index: 52
Originality: Synthesis-oriented
AI Analysis

This work addresses reliability issues in speech emotion recognition for real-world applications. It is incremental: it evaluates existing methods rather than introducing new ones.

The paper tackles uncertainty quantification in speech emotion recognition under real-world challenges such as corrupted signals and the absence of speech. It shows that simple methods can already indicate the uncertainty of a prediction, and that training with additional out-of-distribution data greatly improves the identification of such signals.

Uncertainty Quantification (UQ) is an important building block for the reliable use of neural networks in real-world scenarios, as it can be a useful tool in identifying faulty predictions. Speech emotion recognition (SER) models can suffer from particularly many sources of uncertainty, such as the ambiguity of emotions, Out-of-Distribution (OOD) data or, in general, poor recording conditions. Reliable UQ methods are thus of particular interest, as in many SER applications no prediction is better than a faulty prediction. While the effects of label ambiguity on uncertainty are well documented in the literature, we focus our work on an evaluation of UQ methods for SER under common challenges in real-world applications, such as corrupted signals and the absence of speech. We show that simple UQ methods can already give an indication of the uncertainty of a prediction and that training with additional OOD data can greatly improve the identification of such signals.
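To make the idea of a "simple UQ method" concrete, the sketch below uses the normalized entropy of a classifier's softmax output as an uncertainty score and abstains when the score is too high, in line with the abstract's point that no prediction can be better than a faulty one. This is an illustration under assumptions, not the paper's implementation: the abstract does not say which UQ methods were evaluated, and the label set `EMOTIONS`, the function names, and the `threshold` value are all hypothetical.

```python
import math

# Hypothetical label set; the paper's actual emotion classes are not given here.
EMOTIONS = ["anger", "happiness", "neutral", "sadness"]


def softmax(logits):
    """Convert raw model outputs into class probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs):
    """Shannon entropy of the distribution, scaled to [0, 1].

    0 means all probability mass sits on one class; 1 means uniform.
    """
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))


def predict_or_abstain(logits, labels=EMOTIONS, threshold=0.8):
    """Return (label, uncertainty), or (None, uncertainty) when too uncertain.

    The threshold is an assumed value and would need tuning on held-out data.
    """
    probs = softmax(logits)
    uncertainty = normalized_entropy(probs)
    if uncertainty > threshold:
        return None, uncertainty
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], uncertainty
```

A call such as `predict_or_abstain([10.0, 0.0, 0.0, 0.0])` returns the first label with a low uncertainty score, while near-uniform logits, as one might hope to see for silence or heavily corrupted audio, lead to abstention. Whether a given model actually produces near-uniform outputs on such inputs is exactly what an evaluation like the paper's has to establish.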

Code Implementations: 1 repo