Spatiotemporal Networks for Video Emotion Recognition
This work addresses emotion recognition from videos. It is an incremental improvement built on established techniques.
The paper tackles video emotion recognition by adapting existing deep learning and traditional methods, and reports good results on the AFEW 6.0 dataset.
Our experiment adapts several popular deep learning methods, as well as some traditional methods, to the problem of video emotion recognition. We use a CNN-LSTM architecture for visual information extraction and classification, and traditional methods for audio feature classification. For multimodal fusion, we use a Support Vector Machine. Our experiment yields a good result on the AFEW 6.0 dataset.
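To make the visual branch concrete, the following is a minimal PyTorch sketch of a generic CNN-LSTM clip classifier. It is not the paper's actual network: the class name CnnLstm, the layer sizes, the input resolution, and the choice to classify from the last LSTM hidden state are all assumptions made for illustration. The seven output classes follow the AFEW label set.

```python
import torch
import torch.nn as nn

# AFEW labels each clip with one of seven emotion classes.
NUM_EMOTIONS = 7


class CnnLstm(nn.Module):
    """Hypothetical CNN-LSTM: a small CNN encodes each frame, an LSTM
    aggregates the frame features over time, and a linear layer classifies.
    All layer sizes are illustrative assumptions, not the paper's values."""

    def __init__(self, num_classes=NUM_EMOTIONS, feat_dim=64, hidden_dim=128):
        super().__init__()
        # Per-frame feature extractor (assumed architecture).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global average pool to 32 x 1 x 1
            nn.Flatten(),
            nn.Linear(32, feat_dim),
            nn.ReLU(),
        )
        # Temporal model over the sequence of frame features.
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, clips):
        # clips: (batch, time, channels, height, width)
        b, t, c, h, w = clips.shape
        # Fold time into the batch axis so the CNN sees individual frames.
        frame_feats = self.cnn(clips.reshape(b * t, c, h, w))
        frame_feats = frame_feats.reshape(b, t, -1)
        # h_n has shape (num_layers, batch, hidden_dim); use the last layer.
        _, (h_n, _) = self.lstm(frame_feats)
        return self.classifier(h_n[-1])


if __name__ == "__main__":
    model = CnnLstm()
    dummy_clips = torch.randn(2, 5, 3, 32, 32)  # 2 clips of 5 RGB frames
    print(model(dummy_clips).shape)
```

In a pipeline like the one described, the per-clip outputs of such a network could be combined with the audio classifier's outputs and passed to an SVM for fusion. How the paper actually builds its fusion features is not stated here, so that step is left out of the sketch.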