CV LG IV ML Nov 26, 2019

Revisiting Deep Architectures for Head Motion Prediction in 360° Videos

arXiv:1911.11702v3, 2 citations
Originality: Incremental advance
AI Analysis

This work addresses a domain-specific problem for 360° video applications, offering incremental improvements through a refined neural architecture.

The paper tackles head motion prediction in 360° videos by identifying flaws in existing deep-learning methods and designing a new architecture, TRACK, which achieves state-of-the-art performance and outperforms competitors by up to 20% on focus-type videos for prediction horizons of 2 to 5 seconds.

We consider predicting the user's head motion in 360-degree videos using two modalities only: the user's past positions and the video content (without knowledge of other users' traces). We make two main contributions. First, we re-examine existing deep-learning approaches for this problem and identify hidden flaws through a thorough root-cause analysis. Second, building on the results of this analysis, we design a new proposal that establishes state-of-the-art performance. For the first contribution, re-assessing the existing methods that use both modalities, we obtain the surprising result that they all perform worse than baselines using the user's trajectory only. A root-cause analysis of the metrics, datasets and neural architectures shows in particular that (i) the content can inform the prediction for horizons longer than 2 to 3 seconds (existing methods consider shorter horizons), and that (ii) to compete with the baselines, it is necessary, though not sufficient, to have a recurrent unit dedicated to processing the positions. For the second contribution, from a re-examination of the problem supported by the concept of Structural-RNN, we design a new deep neural architecture named TRACK. TRACK achieves state-of-the-art performance on all considered datasets and prediction horizons, outperforming competitors by up to 20 percent on focus-type videos and horizons of 2 to 5 seconds. The entire framework (code and datasets) is online and received an ACM reproducibility badge.
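To make the idea of a "baseline using the user's trajectory only" concrete, here is a minimal sketch of one such baseline: it predicts that the head stays at its last observed position and scores the prediction by the angle between predicted and actual directions. This is an illustration under stated assumptions, not the paper's code. The abstract does not say which baselines or metrics were used, so the static predictor, the great-circle error, and all function names below are hypothetical.

```python
import math


def to_unit_vector(yaw, pitch):
    """Convert yaw and pitch (radians) to a 3D unit vector on the sphere."""
    return (
        math.cos(pitch) * math.cos(yaw),
        math.cos(pitch) * math.sin(yaw),
        math.sin(pitch),
    )


def great_circle_distance(a, b):
    """Angle in radians between two unit vectors (assumed error metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.acos(max(-1.0, min(1.0, dot)))


def static_baseline(past_positions, horizon_steps):
    """Hypothetical trajectory-only baseline: repeat the last seen position."""
    last = past_positions[-1]
    return [last] * horizon_steps


def mean_error(predicted, actual):
    """Average angular error over the prediction horizon."""
    total = sum(great_circle_distance(p, q) for p, q in zip(predicted, actual))
    return total / len(actual)


# Toy trace: the head pans steadily in yaw at zero pitch.
past = [to_unit_vector(0.0, 0.0), to_unit_vector(0.1, 0.0)]
future = [to_unit_vector(0.2, 0.0), to_unit_vector(0.3, 0.0)]
prediction = static_baseline(past, horizon_steps=len(future))
error = mean_error(prediction, future)
```

A predictor like this ignores the video content entirely, which is why it serves as a useful sanity check: a model that uses both modalities should not do worse than it.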
