CL, AI, CV, LG, MM · Jul 26, 2017

Video Highlight Prediction Using Audience Chat Reactions

arXiv:1707.08559v1 · 1092 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of multimodal, multilingual analysis for sports video portals, though it appears incremental because it builds on existing CNN-RNN architectures.

The paper tackles the problem of automatically predicting highlights in sports videos by jointly analyzing visual features and textual audience chat reactions in English and traditional Chinese, and reports strong results on a novel dataset from League of Legends championships.

Sports channel video portals offer an exciting domain for research on multimodal, multilingual analysis. We present methods addressing the problem of automatic video highlight prediction based on joint visual features and textual analysis of the real-world audience discourse with complex slang, in both English and traditional Chinese. We present a novel dataset based on League of Legends championships recorded from North American and Taiwanese Twitch.tv channels (will be released for further research), and demonstrate strong results on these using multimodal, character-level CNN-RNN model architectures.
