Gholam Ali Montazer

HCNov 21, 2021

Customizing an Affective Tutoring System Based on Facial Expression and Head Pose Estimation

Mahdi Pourmirzaei, Gholam Ali Montazer, Ebrahim Mousavi

In recent years, the main problem in e-learning has shifted from analyzing content to personalization of learning environment by Intelligence Tutoring Systems (ITSs). Therefore, by designing personalized teaching models, learners are able to have a successful and satisfying experience in achieving their learning goals. Affective Tutoring Systems (ATSs) are some kinds of ITS that can recognize and respond to affective states of learner. In this study, we designed, implemented, and evaluated a system to personalize the learning environment based on the facial emotions recognition, head pose estimation, and cognitive style of learners. First, a unit called Intelligent Analyzer (AI) created which was responsible for recognizing facial expression and head angles of learners. Next, the ATS was built which mainly made of two units: ITS, IA. Results indicated that with the ATS, participants needed less efforts to pass the tests. In other words, we observed when the IA unit was activated, learners could pass the final tests in fewer attempts than those for whom the IA unit was deactivated. Additionally, they showed an improvement in terms of the mean passing score and academic satisfaction.

CVMay 13, 2021

Using Self-Supervised Auxiliary Tasks to Improve Fine-Grained Facial Representation

Mahdi Pourmirzaei, Gholam Ali Montazer, Farzaneh Esmaili

Facial emotion recognition (FER) is a fine-grained problem where the value of transfer learning is often assumed. We first quantify this assumption and show that, on AffectNet, training from random initialization with sufficiently strong augmentation consistently matches or surpasses fine-tuning from ImageNet. Motivated by this result, we propose Hybrid Multi-Task Learning (HMTL) for FER in the wild. HMTL augments supervised learning (SL) with self-supervised learning (SSL) objectives during training, while keeping the inference-time model unchanged. We instantiate HMTL with two tailored pretext tasks, puzzling and inpainting with a perceptual loss, that encourage part-aware and expression-relevant features. On AffectNet, both HMTL variants achieve state-of-the-art accuracy in the eight-emotion setting without any additional pretraining data, and they provide larger gains under low-data regimes. Compared with conventional SSL pretraining, HMTL yields stronger downstream performance. Beyond FER, the same strategy improves fine-grained facial analysis tasks, including head pose estimation and gender recognition. These results suggest that aligned SSL auxiliaries are an effective and simple way to strengthen supervised fine-grained facial representation without adding extra computation cost during inference time.

Gholam Ali Montazer

2 Papers