CV · Mar 16, 2023
Facial Affective Behavior Analysis Method for 5th ABAW Competition
Shangfei Wang, Yanan Chang, Yi Wu et al.
Facial affective behavior analysis is important for human-computer interaction. The 5th ABAW competition includes three challenges based on the Aff-Wild2 database, covering three common facial affective analysis tasks: valence-arousal estimation, expression classification, and action unit recognition. For these challenges, we construct three different models, each designed to address the problems of its task, such as data imbalance and data noise, and thereby improve the results. In the experiments for all three challenges, we train the models on the provided training data and validate them on the validation data.
AI · Dec 2, 2025
Empathy Level Prediction in Multi-Modal Scenario with Supervisory Documentation Assistance
Yufei Xiao, Shangfei Wang
Prevalent empathy prediction techniques concentrate primarily on a single modality, typically text, and thus neglect multi-modal processing capabilities. They also overlook privileged information that may contain additional empathetic content. In response, we introduce a multi-modal empathy prediction method that integrates video, audio, and text information. The method comprises two parts: Multi-Modal Empathy Prediction and Supervisory Documentation Assisted Training. In the empathy prediction network, we use pre-trained networks to extract features from the individual modalities, followed by cross-modal fusion. This process yields a multi-modal feature representation, which is used to predict empathy labels. To enhance the extraction of text features, we incorporate supervisory documents as privileged information during the assisted training phase. Specifically, we apply the Latent Dirichlet Allocation model to identify potential topic distributions, which are then used to constrain the text features. These supervisory documents, created by supervisors, focus on the counseling topics and the counselor's display of empathy. Notably, this privileged information is available only during training and is not accessible during the prediction phase. Experimental results on the multi-modal and dialogue empathy datasets demonstrate that our approach outperforms existing methods.
CV · May 4, 2023
MEDIC: A Multimodal Empathy Dataset in Counseling
Zhou'an Zhu, Xin Li, Jicai Pan et al.
Although empathic interaction between counselor and client is fundamental to success in the psychotherapeutic process, there are currently few datasets to support a computational approach to empathy understanding. In this paper, we construct a multimodal empathy dataset collected from face-to-face psychological counseling sessions. The dataset consists of 771 video clips. We also propose three labels (i.e., expression of experience, emotional reaction, and cognitive reaction) to describe the degree of empathy between counselors and their clients. Expression of experience describes whether the client has expressed experiences that can trigger empathy, while emotional and cognitive reactions indicate the counselor's empathic reactions. As an elementary assessment of the usability of the constructed dataset, we conduct an interrater reliability analysis of the annotators' subjective evaluations of the video clips using the intraclass correlation coefficient and Fleiss' Kappa. The results indicate that our data annotation is reliable. Furthermore, we conduct empathy prediction using three typical methods: the tensor fusion network, the sentimental words aware fusion network, and a simple concatenation model. The experimental results show that empathy can be well predicted on our dataset. Our dataset is available for research purposes.
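The Fleiss' Kappa check mentioned in the abstract above can be illustrated with a minimal sketch. The function name `fleiss_kappa`, the input layout (one row per clip, one column per label category, each cell holding the number of annotators who chose that category), and the toy rating table are assumptions made for illustration; they are not taken from the MEDIC paper, which does not describe its implementation here.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' Kappa for a table where counts[i][j] is the number of
    raters who assigned item i to category j (hypothetical layout)."""
    if not counts:
        raise ValueError("at least one rated item is required")
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two raters per item are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must be rated by the same number of raters")
    n_categories = len(counts[0])

    # Share of all assignments that went to each category.
    p_j = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per item: fraction of rater pairs that agree.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_items          # mean observed agreement
    p_e = sum(p * p for p in p_j)       # agreement expected by chance
    if p_e == 1.0:
        # All ratings fall in one category; kappa is undefined, so we
        # return 1.0 by convention in this sketch.
        return 1.0
    return (p_bar - p_e) / (1.0 - p_e)


# Toy example: 4 clips, 3 annotators, 2 categories (made-up numbers).
toy_ratings = [
    [3, 0],
    [0, 3],
    [2, 1],
    [1, 2],
]
print(f"Fleiss' Kappa on toy data: {fleiss_kappa(toy_ratings):.3f}")
```

Values near 1 indicate agreement well above chance, and values near 0 indicate agreement no better than chance. Thresholds for "reliable" annotation vary by field, so the cut-off applied to any real dataset is a separate choice.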