A Multi-task Neural Approach for Emotion Attribution, Classification and Summarization
This work addresses emotion analysis in videos, which matters for applications such as content recommendation. The contribution appears incremental: it builds on existing neural methods, and its novelty lies in integrating them to address one specific bottleneck.
The paper tackles the sparsity of emotional expressions in user-generated videos by proposing BEAC-Net, a multi-task neural framework for emotion attribution, classification, and summarization, and reports superior performance on two video datasets.
Emotional content is a crucial ingredient in user-generated videos. However, the sparsity of emotional expressions in the videos poses an obstacle to visual emotion analysis. In this paper, we propose a new neural approach, Bi-stream Emotion Attribution-Classification Network (BEAC-Net), to solve three related emotion analysis tasks: emotion recognition, emotion attribution, and emotion-oriented summarization, in a single integrated framework. BEAC-Net has two major constituents, an attribution network and a classification network. The attribution network extracts the main emotional segment that classification should focus on in order to mitigate the sparsity issue. The classification network utilizes both the extracted segment and the original video in a bi-stream architecture. We contribute a new dataset for the emotion attribution task with human-annotated ground-truth labels for emotion segments. Experiments on two video datasets demonstrate superior performance of the proposed framework and the complementary nature of the dual classification streams.
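As a rough illustration of the data flow described in the abstract, the following Python sketch selects one emotional segment and then classifies from both the segment and the full video. It is not the BEAC-Net implementation. The sliding-window selection over per-frame scores, the mean pooling, and the single linear classifier are simplifying assumptions standing in for the learned attribution and classification networks, and all function names and parameters are hypothetical.

```python
import math


def mean_pool(frames):
    """Average per-frame feature vectors into one clip-level vector."""
    dim = len(frames[0])
    return [sum(f[i] for f in frames) / len(frames) for i in range(dim)]


def attribute_segment(frame_scores, window):
    """Return (start, end) of the contiguous window with the highest total score.

    Assumption: a fixed-length sliding window over given per-frame scores
    replaces the learned attribution network.
    """
    best_start = 0
    current = sum(frame_scores[:window])
    best_sum = current
    for start in range(1, len(frame_scores) - window + 1):
        # Slide the window one frame to the right.
        current += frame_scores[start + window - 1] - frame_scores[start - 1]
        if current > best_sum:
            best_start, best_sum = start, current
    return best_start, best_start + window


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_bistream(frames, frame_scores, weights, window):
    """Fuse a segment stream and a full-video stream, then classify.

    Assumption: the two streams are fused by concatenating their pooled
    features and applying one linear layer (`weights`, one row per class).
    """
    start, end = attribute_segment(frame_scores, window)
    segment_vec = mean_pool(frames[start:end])  # stream 1: extracted segment
    video_vec = mean_pool(frames)               # stream 2: original video
    fused = segment_vec + video_vec             # list concatenation
    logits = [sum(w * x for w, x in zip(row, fused)) for row in weights]
    return (start, end), softmax(logits)
```

In this sketch the returned `(start, end)` span plays the role of the attributed emotional segment, which could also serve as a crude emotion-oriented summary, while the probability vector is the classification output.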