LG MM SD AS IV

Jan 26, 2021

A Case Study of Deep Learning Based Multi-Modal Methods for Predicting the Age-Suitability Rating of Movie Trailers

Mahsa Shafaei, Christos Smailis, Ioannis A. Kakadiaris, Thamar Solorio

arXiv:2101.11704

Originality: Incremental advance

AI Analysis

This work addresses the need for automated content rating in media, though it appears incremental as it builds on existing multi-modal methods for a specific domain.

The authors tackle automated age-suitability rating of movie trailers by introducing a new dataset and a multi-modal deep learning pipeline that combines video, audio, and speech information. In their experiments, the multi-modal approaches significantly outperform the best mono- and bimodal models.

In this work, we explore different approaches to combine modalities for the problem of automated age-suitability rating of movie trailers. First, we introduce a new dataset containing videos of movie trailers in English downloaded from IMDB and YouTube, along with their corresponding age-suitability rating labels. Secondly, we propose a multi-modal deep learning pipeline addressing the movie trailer age suitability rating problem. This is the first attempt to combine video, audio, and speech information for this problem, and our experimental results show that multi-modal approaches significantly outperform the best mono and bimodal models in this task.
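The abstract does not say how the modalities are combined, so the following is only a minimal sketch of two generic fusion strategies, not the authors' pipeline. Early fusion concatenates per-modality feature vectors before classification; late fusion averages per-modality class probabilities. The rating labels, modality names, weights, and all function names are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical label set for illustration; the abstract does not list the rating classes.
RATINGS = ["G", "PG", "PG-13", "R"]


def early_fusion(features: Dict[str, Sequence[float]], order: Sequence[str]) -> List[float]:
    """Concatenate per-modality feature vectors in a fixed modality order."""
    fused: List[float] = []
    for name in order:
        fused.extend(features[name])
    return fused


def late_fusion(
    probs: Dict[str, Sequence[float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Weighted average of per-modality class-probability vectors."""
    names = sorted(probs)
    if weights is None:
        # Assumption: equal trust in every modality unless weights are given.
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    n_classes = len(probs[names[0]])
    fused = [0.0] * n_classes
    for name in names:
        if len(probs[name]) != n_classes:
            raise ValueError(f"modality {name!r} has a different number of classes")
        for i, p in enumerate(probs[name]):
            fused[i] += weights[name] * p / total
    return fused


def predict(fused_probs: Sequence[float], labels: Sequence[str] = RATINGS) -> str:
    """Return the label with the highest fused probability."""
    best = max(range(len(fused_probs)), key=lambda i: fused_probs[i])
    return labels[best]
```

In this sketch, each modality (for example video, audio, and speech transcript) would first be passed through its own model to produce either a feature vector or a probability vector; `late_fusion` followed by `predict` then yields a single rating per trailer.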
