Deep Impression: Audiovisual Deep Residual Networks for Multimodal Apparent Personality Trait Recognition
This work addresses automated recognition of apparent personality traits from audiovisual data. The contribution is incremental: it builds on existing deep learning methods for multimodal tasks.
The authors tackled the problem of recognizing apparent personality traits from videos using an audiovisual deep residual network, achieving a test accuracy of 0.9109 and third place in the ChaLearn First Impressions Challenge.
Here, we develop an audiovisual deep residual network for multimodal apparent personality trait recognition. The network is trained end-to-end to predict the Big Five personality traits of people from their videos. That is, the network does not require any feature engineering or visual analysis such as face detection, face landmark alignment or facial expression recognition. Recently, the network won third place in the ChaLearn First Impressions Challenge with a test accuracy of 0.9109.
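
The text reports a test accuracy of 0.9109 but does not define the metric. As a minimal sketch, assume each trait is annotated as a continuous score in [0, 1] and that accuracy is one minus the mean absolute error, averaged over the five traits. Under that assumption, the score could be computed as follows; the function name `mean_accuracy`, the trait ordering in `TRAITS`, and the toy data are hypothetical and used only for illustration.

```python
from statistics import mean

# Hypothetical trait ordering; the text only says "Big Five".
TRAITS = (
    "openness",
    "conscientiousness",
    "extraversion",
    "agreeableness",
    "neuroticism",
)


def mean_accuracy(predictions, targets):
    """Return 1 - mean absolute error, averaged over traits.

    Assumption (not stated in the text): every score lies in [0, 1] and
    each row holds one value per trait, in the order given by TRAITS.
    """
    if not predictions or len(predictions) != len(targets):
        raise ValueError("predictions and targets must be non-empty and equally long")

    per_trait = []
    for t in range(len(TRAITS)):
        # Absolute error of every video for this one trait.
        errors = [abs(p[t] - y[t]) for p, y in zip(predictions, targets)]
        per_trait.append(1.0 - mean(errors))

    # Average the per-trait accuracies into a single score.
    return mean(per_trait)


if __name__ == "__main__":
    # Two toy videos with made-up scores, for illustration only.
    predicted = [[0.5] * 5, [0.6] * 5]
    annotated = [[0.5] * 5, [0.4] * 5]
    print(round(mean_accuracy(predicted, annotated), 4))
```

Under this definition, a predictor whose outputs deviate from the annotations by about 0.09 on average would score close to the reported figure.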