CV | AI | LG | MM | IV | Oct 22, 2024

VEMOCLAP: A video emotion classification web application

arXiv:2410.21303v1 | 1 citations | h-index: 5 | Has Code | ISM
Originality: Incremental advance
AI Analysis

This work provides an open-source tool that lets users analyze the emotional content of videos, though it builds incrementally on previous work.

The researchers address video emotion classification with VEMOCLAP, a web application that fuses pretrained video-frame and audio features using multi-head cross-attention and raises the state-of-the-art classification accuracy on the Ekman-6 dataset by 4.3%.

We introduce VEMOCLAP: Video EMOtion Classifier using Pretrained features, the first readily available, open-source web application that analyzes the emotional content of any user-provided video. We improve on our previous work, which exploits open-source pretrained models that operate on video frames and audio and then efficiently fuses the resulting pretrained features using multi-head cross-attention. Our approach increases the state-of-the-art classification accuracy on the Ekman-6 video emotion dataset by 4.3% and offers an online application where users can run our model on their own videos or on YouTube videos. We invite readers to try our application at serkansulun.com/app.
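
To make the fusion step in the abstract more concrete, here is a minimal NumPy sketch of multi-head cross-attention between two sets of pretrained features. It is not the authors' implementation: the function name `cross_attention_fuse`, the feature dimension, the number of heads, the choice of video features as queries and audio features as keys and values, and the random projection weights are all assumptions made for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention_fuse(query_feats, context_feats, params, num_heads):
    """Fuse two feature sets with multi-head cross-attention.

    query_feats:   (n_q, d) features that ask the questions (assumed: video).
    context_feats: (n_c, d) features that are attended to (assumed: audio).
    Returns the fused features (n_q, d) and attention weights (heads, n_q, n_c).
    """
    n_q, d = query_feats.shape
    n_c = context_feats.shape[0]
    head_dim = d // num_heads

    # Linear projections to queries, keys, and values.
    q = query_feats @ params["W_q"]
    k = context_feats @ params["W_k"]
    v = context_feats @ params["W_v"]

    # Split the feature dimension into heads: (num_heads, n, head_dim).
    q = q.reshape(n_q, num_heads, head_dim).transpose(1, 0, 2)
    k = k.reshape(n_c, num_heads, head_dim).transpose(1, 0, 2)
    v = v.reshape(n_c, num_heads, head_dim).transpose(1, 0, 2)

    # Scaled dot-product attention, computed independently per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)
    weights = softmax(scores, axis=-1)
    heads = weights @ v

    # Concatenate the heads and apply the output projection.
    merged = heads.transpose(1, 0, 2).reshape(n_q, d)
    return merged @ params["W_o"], weights


# Hypothetical sizes and random weights, purely for demonstration.
rng = np.random.default_rng(0)
d, num_heads = 8, 2
params = {
    name: rng.normal(scale=0.1, size=(d, d))
    for name in ("W_q", "W_k", "W_v", "W_o")
}
video_feats = rng.normal(size=(4, d))  # stand-in for frame-based model outputs
audio_feats = rng.normal(size=(3, d))  # stand-in for audio model outputs

fused, weights = cross_attention_fuse(video_feats, audio_feats, params, num_heads)
```

In a classifier along these lines, the fused features would typically be pooled and passed to a small output layer that scores the emotion classes; that stage is omitted here because the abstract does not describe it.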

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
