Bias and Fairness on Multimodal Emotion Detection Algorithms
This work addresses bias and fairness in multimodal emotion detection, an issue central to equitable AI applications in domains such as healthcare and social computing. The contribution is incremental: it extends existing unimodal bias research to multimodal contexts.
The study investigates bias in emotion recognition systems across individual modalities and their combinations. It finds that text alone exhibits the least bias and contributes most to performance, which calls into question the value of multimodal approaches when fairness is a priority.
Numerous studies have shown that machine learning algorithms can latch onto protected attributes such as race and gender and generate predictions that systematically discriminate against one or more groups. To date, the majority of bias and fairness research has focused on unimodal models. In this work, we explore the biases that exist in emotion recognition systems in relation to the modalities used, and study how multimodal approaches affect system bias and fairness. We consider audio, text, and video modalities, as well as all possible multimodal combinations of them, and find that text alone has the least bias and accounts for the majority of the models' performance, raising doubts about the value of multimodal emotion recognition systems when bias and fairness are desired alongside model performance.
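To make the phrase "all possible multimodal combinations" concrete, the following minimal sketch enumerates the seven non-empty subsets of the three modalities. It also includes one simple way a per-group performance gap could be measured. The function names and the accuracy-gap measure are illustrative assumptions; the abstract does not state which bias or fairness metrics the study uses.

```python
from itertools import combinations

# The three modalities named in the abstract.
MODALITIES = ("audio", "text", "video")


def modality_subsets(modalities=MODALITIES):
    """Return every non-empty combination of modalities, smallest first."""
    return [
        combo
        for size in range(1, len(modalities) + 1)
        for combo in combinations(modalities, size)
    ]


def accuracy_gap(y_true, y_pred, groups):
    """Largest accuracy difference between any two groups.

    Hypothetical bias proxy for illustration only; not the metric
    used in the study.
    """
    correct = {}
    total = {}
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] = total.get(group, 0) + 1
        correct[group] = correct.get(group, 0) + int(truth == pred)
    accuracies = [correct[g] / total[g] for g in total]
    return max(accuracies) - min(accuracies)


if __name__ == "__main__":
    for combo in modality_subsets():
        print("+".join(combo))
```

In such a setup, a model would be trained and evaluated once per subset, and a gap measure like the one above would be compared across subsets to see whether adding modalities increases or decreases disparity between groups.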