VEMOCLAP: A video emotion classification web application
VEMOCLAP provides an open-source tool for analyzing the emotional content of videos, though it builds incrementally on the authors' previous work.
The researchers tackled video emotion classification by developing VEMOCLAP, a web application that fuses pretrained features using multi-head cross-attention and improves on the state-of-the-art classification accuracy on the Ekman-6 dataset by 4.3%.
We introduce VEMOCLAP: Video EMOtion Classifier using Pretrained features, the first readily available and open-source web application that analyzes the emotional content of any user-provided video. We improve on our previous work, which extracts features from video frames and audio using open-source pretrained models and then efficiently fuses these features using multi-head cross-attention. Our approach increases the state-of-the-art classification accuracy on the Ekman-6 video emotion dataset by 4.3% and offers an online application in which users can run our model on their own videos or on YouTube videos. We invite readers to try our application at serkansulun.com/app.
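To make the fusion step concrete, the following is a minimal NumPy sketch of multi-head cross-attention between two sets of pretrained features, followed by pooling and a six-way classifier (Ekman-6 has six emotion classes). It is an illustration under stated assumptions, not the VEMOCLAP implementation: the feature dimension, sequence lengths, number of heads, the choice of visual features as queries and audio features as keys and values, the mean pooling, and all names (`multi_head_cross_attention`, `W_cls`, and so on) are hypothetical, and the weights are random rather than trained.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_cross_attention(query_feats, context_feats, params, num_heads):
    """Attend from query_feats (Tq, d) to context_feats (Tk, d).

    Returns the fused features (Tq, d) and the attention
    weights (num_heads, Tq, Tk).
    """
    Wq, Wk, Wv, Wo = params
    d_model = query_feats.shape[-1]
    assert d_model % num_heads == 0
    d_head = d_model // num_heads

    def split_heads(x):
        # (T, d_model) -> (num_heads, T, d_head)
        return x.reshape(x.shape[0], num_heads, d_head).transpose(1, 0, 2)

    q = split_heads(query_feats @ Wq)    # queries from one modality
    k = split_heads(context_feats @ Wk)  # keys from the other modality
    v = split_heads(context_feats @ Wv)  # values from the other modality

    # Scaled dot-product attention, computed independently per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (H, Tq, Tk)
    weights = softmax(scores, axis=-1)
    heads = weights @ v                                   # (H, Tq, d_head)

    # Concatenate the heads and apply the output projection.
    merged = heads.transpose(1, 0, 2).reshape(-1, d_model)
    return merged @ Wo, weights


# Hypothetical sizes; real pretrained features would replace the random arrays.
rng = np.random.default_rng(0)
d_model, num_heads, num_classes = 32, 4, 6
visual = rng.standard_normal((8, d_model))  # 8 frame-level feature vectors
audio = rng.standard_normal((5, d_model))   # 5 audio-segment feature vectors

# Random (untrained) projection matrices: Wq, Wk, Wv, Wo.
params = [rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
          for _ in range(4)]
W_cls = rng.standard_normal((d_model, num_classes)) / np.sqrt(d_model)

# Assumption: visual features act as queries, audio features as keys/values.
fused, weights = multi_head_cross_attention(visual, audio, params, num_heads)

# Mean-pool over time, then map to class probabilities.
logits = fused.mean(axis=0) @ W_cls
probs = softmax(logits)
```

Because the expensive feature extraction is done by frozen pretrained models, only small projection matrices like the ones above would need to be trained in a setup of this kind, which is one plausible reading of why fusing pretrained features is described as efficient.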