Mohammadreza Kangavari

28.3 · AI · Apr 26

Modeling Induced Pleasure through Cognitive Appraisal Prediction via Multimodal Fusion

Nastaran Dab, Raziyeh Zall, Mohammadreza Kangavari

Multimodal affective computing analyzes user-generated social media content to predict emotional states. However, a critical gap remains in understanding how visual content shapes cognitive interpretations and elicits specific affective experiences such as pleasure. This study introduces a novel computational model to infer video-induced pleasure via cognitive appraisal variables. The proposed model addresses four challenges: (1) noisy and inconsistent human labels, (2) the semantic gap between "positive emotions" and "pleasure," (3) the scarcity of pleasure-specific datasets, and (4) the limited interpretability of existing black-box fusion methods. Our approach integrates data-driven and cognitive theory-driven methods, using cognitive appraisal theory and a fuzzy model within an innovative framework. The model employs transformer-based architectures and attention mechanisms for fine-grained multimodal feature extraction and interpretable fusion to capture both inter- and intra-modal dynamics associated with pleasure. This enables the prediction of underlying appraisal variables, thereby bridging the semantic gap and enhancing model explainability beyond conventional statistical associations. Experimental results validate the efficacy of the proposed method in detecting video-induced pleasure, achieving a peak accuracy of 0.6624 in predicting pleasure levels. These findings highlight promising implications for affective content recommendation, intelligent media creation, and advancing our understanding of how digital media influences human emotions.

MM · Apr 17, 2014

A Novel Approach for Video Temporal Annotation

Hadi Restgou Haghi, Mohammadreza Kangavari, Behrang QasemiZadeh

Recent advances in computing, communication, and data storage have led to an increasing number of large digital libraries publicly available on the Internet. The main problem of content-based video retrieval is inferring semantics from raw video data. Video data play an important role in these libraries. Instead of words, a video retrieval system deals with collections of video records. Therefore, the system is confronted with the problem of video understanding. Because machine understanding of video data is still an unsolved research problem, text annotations are usually used to describe the content of video data according to the annotator's understanding and the purpose of that video data. Most proposed systems for video annotation are domain dependent. In addition, many of these systems disregard an important feature of video data: temporality. In this paper, we propose a framework for video temporal annotation. The proposed system uses domain knowledge and a time ontology to perform temporal annotation of input video.

2 Papers