MM CL, May 4, 2021

An Estimation of Online Video User Engagement from Features of Continuous Emotions

Lukas Stappen, Alice Baird, Michelle Lienhart, Annalena Bätz, Björn Schuller

arXiv:2105.01633v13.32 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of predicting user engagement for video content creators and platforms, but it is incremental as it builds on existing methods with new data and feature analysis.

The study tackled the problem of understanding how emotional signals in videos affect online user engagement by analyzing a large dataset of YouTube review videos with emotional annotations. It found that smaller boundary ranges and fluctuations in arousal increase engagement, and feature selection methods improved prediction accuracy, reducing mean absolute error for likes per day from 1.55 to 1.23.

Portraying emotion and trustworthiness is known to increase the appeal of video content. However, the causal relationship between these signals and online user engagement is not well understood. This limited understanding is partly due to a scarcity of emotionally annotated data and the varied modalities through which user engagement is expressed online. In this contribution, we utilise a large dataset of YouTube review videos which includes ca. 600 hours of dimensional arousal, valence and trustworthiness annotations. We investigate features extracted from these signals against various user engagement indicators, including views, like/dislike ratio, and the sentiment of comments. In doing so, we identify the positive and negative influences which single features have, as well as interpretable patterns in each dimension which relate to user engagement. Our results demonstrate that smaller boundary ranges and fluctuations for arousal lead to an increase in user engagement. Furthermore, the extracted time-series features reveal significant (p < 0.05) correlations for each dimension, such as count below signal mean (arousal), number of peaks (valence), and absolute energy (trustworthiness). From this, an effective combination of features is outlined for approaches aiming to automatically predict several user engagement indicators. In a user engagement prediction paradigm, we compare all features against semi-automatic (cross-task) and automatic (task-specific) feature selection methods. These selected feature sets appear to outperform using all features; e.g., using all features achieves a mean absolute error of 1.55 likes per day (Lp/d) from valence, which improves through semi-automatic and automatic selection to 1.33 and 1.23 Lp/d, respectively (data mean 9.72 Lp/d with a std. of 28.75 Lp/d).
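To make the three time-series features named in the abstract more concrete, the following is a minimal sketch in plain Python. The function names, the `support` parameter, and the toy signal are illustrative assumptions; the definitions follow the usual meaning of these feature names and are not confirmed to match the authors' implementation. The last helper shows how a mean absolute error such as the reported Lp/d figures is computed.

```python
from statistics import fmean


def count_below_mean(signal):
    """Number of samples strictly below the signal's mean."""
    mean = fmean(signal)
    return sum(1 for v in signal if v < mean)


def number_of_peaks(signal, support=1):
    """Number of samples strictly larger than their `support` neighbours
    on each side (assumed definition; edge samples are never counted)."""
    signal = list(signal)
    count = 0
    for i in range(support, len(signal) - support):
        neighbours = signal[i - support:i] + signal[i + 1:i + support + 1]
        if all(signal[i] > v for v in neighbours):
            count += 1
    return count


def absolute_energy(signal):
    """Sum of squared sample values."""
    return float(sum(v * v for v in signal))


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return fmean([abs(t - p) for t, p in zip(y_true, y_pred)])


# Toy annotation trace (hypothetical values, not taken from the dataset).
toy_signal = [0.0, 1.0, 0.0, 2.0, 0.0, -1.0, 0.0]

features = {
    "count_below_mean": count_below_mean(toy_signal),
    "number_of_peaks": number_of_peaks(toy_signal),
    "absolute_energy": absolute_energy(toy_signal),
}
print(features)

# Toy likes-per-day targets and predictions (hypothetical values).
print(mean_absolute_error([10.0, 8.0], [9.0, 10.0]))
```

In a setup like the one described, each annotated dimension (arousal, valence, trustworthiness) would yield one such feature vector per video, which then serves as input to a regressor for an engagement indicator.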
