HCAP Aug 31, 2020

Toward Multimodal Modeling of Emotional Expressiveness

arXiv:2009.00001v1
7 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses automated prediction of emotional expressiveness, with applications in science, medicine, and industry. Its contribution is incremental: it builds on an existing video database and existing methods.

The paper predicts emotional expressiveness from visual, linguistic, and multimodal behavioral signals. Its best model reaches an RMSE of 0.65, an R^2 of 0.45, and r of 0.74, and multimodal models tend to perform best overall. It also identifies visual and linguistic predictors such as facial action unit intensity, overall word count, and use of words related to social processes.

Emotional expressiveness captures the extent to which a person tends to outwardly display their emotions through behavior. Due to the close relationship between emotional expressiveness and behavioral health, as well as the crucial role that it plays in social interaction, the ability to automatically predict emotional expressiveness stands to spur advances in science, medicine, and industry. In this paper, we explore three related research questions. First, how well can emotional expressiveness be predicted from visual, linguistic, and multimodal behavioral signals? Second, which behavioral modalities are uniquely important to the prediction of emotional expressiveness? Third, which behavioral signals are reliably related to emotional expressiveness? To answer these questions, we add highly reliable transcripts and human ratings of perceived emotional expressiveness to an existing video database and use this data to train, validate, and test predictive models. Our best model shows promising predictive performance on this dataset (RMSE=0.65, R^2=0.45, r=0.74). Multimodal models tend to perform best overall, and models trained on the linguistic modality tend to outperform models trained on the visual modality. Finally, examination of our interpretable models' coefficients reveals a number of visual and linguistic behavioral signals--such as facial action unit intensity, overall word count, and use of words related to social processes--that reliably predict emotional expressiveness.
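The three figures reported in the abstract (RMSE, R^2, r) measure different things, and a short sketch may help in reading them together. The code below uses the standard textbook definitions; whether the paper computed its metrics in exactly this way is an assumption, and the function name `regression_metrics` and the toy numbers are hypothetical, invented only for illustration.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (rmse, r2, pearson_r) for paired ratings and predictions.

    Standard definitions; not taken from the paper's own code.
    """
    n = len(y_true)
    mean_true = sum(y_true) / n
    mean_pred = sum(y_pred) / n
    # Residual and total sums of squares around the mean of the true ratings.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # Terms needed for the Pearson correlation.
    cov = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))
    ss_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    rmse = math.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot
    r = cov / math.sqrt(ss_tot * ss_pred)
    return rmse, r2, r

# Hypothetical ratings and predictions (not data from the paper).
# The predictions are perfectly correlated with the ratings but shrunk
# toward the mean, so r is 1 while R^2 stays below 1.
ratings = [1.0, 2.0, 3.0, 4.0, 5.0]
predictions = [2.0, 2.5, 3.0, 3.5, 4.0]
rmse, r2, r = regression_metrics(ratings, predictions)
print(f"RMSE={rmse:.2f}, R^2={r2:.2f}, r={r:.2f}")  # RMSE=0.71, R^2=0.75, r=1.00
```

Under these definitions R^2 can never exceed r^2 on the same data, because r ignores errors in the scale and offset of the predictions while R^2 penalizes them. The abstract's values fit that pattern: r = 0.74 gives r^2 of about 0.55, above the reported R^2 of 0.45.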
