Affect Intensity Estimation Using Multiple Modalities
This work addresses the problem of precise emotion intensity estimation for affect recognition applications, but it appears incremental as it builds on existing multimodal approaches without introducing a major breakthrough.
The research tackled the challenge of accurately estimating emotion intensity levels by developing a model based on a weighted sum of classification confidence, feature point displacement, and feature point motion speed, using data from the face, body posture, hand movement, and speech modalities. Results showed that the speech and hand modalities significantly improved the accuracy of emotion intensity estimation on a 0 to 1 scale along the arousal dimension.
One of the challenges in affect recognition is accurate estimation of the emotion intensity level. This research proposes an affect intensity estimation model based on a weighted sum of classification confidence levels, displacement of feature points, and speed of feature point motion. The parameters of the model were calculated from data captured using multiple modalities: face, body posture, hand movement, and speech. A preliminary study was conducted to compare the model's estimates with annotated intensity levels. An emotion intensity scale ranging from 0 to 1 along the arousal dimension of the emotion space was used. Results indicated that the speech and hand modalities contributed significantly to improving the accuracy of emotion intensity estimation with the proposed model.
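The weighted-sum idea described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `ModalityFeatures` and `estimate_intensity`, the default weights, the assumption that every input is already normalized to [0, 1], and the normalization and clipping steps are all hypothetical choices made for illustration. The abstract does not state how the actual weights were derived or how the modalities were combined.

```python
from dataclasses import dataclass


@dataclass
class ModalityFeatures:
    """Hypothetical per-modality inputs, each assumed pre-normalized to [0, 1]."""
    confidence: float    # classifier confidence for the recognized emotion
    displacement: float  # displacement of tracked feature points
    speed: float         # speed of feature point motion


def estimate_intensity(features, weights=(0.5, 0.25, 0.25)):
    """Return an intensity estimate in [0, 1] as a normalized weighted sum.

    The default weights are illustrative placeholders, not values from the paper.
    """
    w_conf, w_disp, w_speed = weights
    total = w_conf + w_disp + w_speed
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    raw = (
        w_conf * features.confidence
        + w_disp * features.displacement
        + w_speed * features.speed
    ) / total
    # Clip so the result stays on the 0-to-1 arousal intensity scale.
    return min(1.0, max(0.0, raw))


if __name__ == "__main__":
    hand = ModalityFeatures(confidence=0.8, displacement=0.4, speed=0.2)
    print(f"estimated intensity: {estimate_intensity(hand):.2f}")
```

Dividing by the sum of the weights keeps the estimate on the same 0 to 1 scale regardless of how the weights are chosen. In a multimodal setting, one such estimate could be computed per modality and then combined, but the fusion rule is left open here because the abstract does not specify it.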