CV · GR · MM · May 4, 2017

Deep 360 Pilot: Learning a Deep Agent for Piloting through 360° Sports Video

arXiv:1705.01759v1 · 186 citations
Originality: Incremental advance
AI Analysis

This paper addresses the viewer's burden of manually steering the view in immersive sports video, but the advance is incremental: it applies existing policy gradient methods to a specific domain.

The paper tackles the problem of automatically selecting viewing angles in 360° sports videos to relieve viewers from manual piloting, achieving the best performance in accuracy and smoothness compared to baselines on a new dataset.

Watching a 360° sports video requires a viewer to continuously select a viewing angle, either through a sequence of mouse clicks or head movements. To relieve the viewer from this "360 piloting" task, we propose "deep 360 pilot" -- a deep learning-based agent for piloting through 360° sports videos automatically. At each frame, the agent observes a panoramic image and knows the previously selected viewing angles. The task of the agent is to shift the current viewing angle (i.e., action) to the next preferred one (i.e., goal). We propose to directly learn an online policy of the agent from data. We use the policy gradient technique to jointly train our pipeline by minimizing (1) a regression loss measuring the distance between the selected and ground truth viewing angles and (2) a smoothness loss encouraging smooth transitions in viewing angle, while (3) maximizing an expected reward for focusing on a foreground object. To evaluate our method, we build a new 360-Sports video dataset consisting of five sports domains. We train domain-specific agents and achieve the best performance on viewing angle selection accuracy and transition smoothness compared to [51] and other baselines.
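
The three training terms named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the weights `w_smooth` and `w_reward`, the mean-reward baseline, and the use of plain squared Euclidean distance on (longitude, latitude) pairs with no wraparound handling are all assumptions made for illustration.

```python
# Hypothetical sketch of a combined objective: regression + smoothness
# losses, plus a REINFORCE-style surrogate for the expected reward.
# Angles are (longitude, latitude) tuples; wraparound is ignored (assumption).

def regression_loss(pred, target):
    """Mean squared distance between predicted and ground-truth angles."""
    total = sum((p[0] - t[0]) ** 2 + (p[1] - t[1]) ** 2
                for p, t in zip(pred, target))
    return total / len(pred)

def smoothness_loss(pred):
    """Mean squared change between consecutive predicted angles."""
    if len(pred) < 2:
        return 0.0
    diffs = [(b[0] - a[0]) ** 2 + (b[1] - a[1]) ** 2
             for a, b in zip(pred, pred[1:])]
    return sum(diffs) / len(diffs)

def reinforce_surrogate(log_probs, rewards):
    """Negated mean of log-probability times baseline-subtracted reward.

    In an autodiff framework, minimizing this value with respect to the
    policy parameters follows the policy gradient estimate; here it is
    only evaluated numerically. The mean-reward baseline is an assumption.
    """
    baseline = sum(rewards) / len(rewards)
    weighted = sum(lp * (r - baseline) for lp, r in zip(log_probs, rewards))
    return -weighted / len(rewards)

def total_loss(pred, target, log_probs, rewards, w_smooth=0.1, w_reward=1.0):
    """Weighted sum of the three terms; the weights are hypothetical."""
    return (regression_loss(pred, target)
            + w_smooth * smoothness_loss(pred)
            + w_reward * reinforce_surrogate(log_probs, rewards))

# Two-frame toy example with made-up numbers.
pred = [(0.0, 0.0), (3.0, 4.0)]
target = [(0.0, 0.0), (0.0, 0.0)]
log_probs = [-1.0, -2.0]   # log-probability of each selected action
rewards = [1.0, 0.0]       # e.g. 1 when the view covers a foreground object
loss = total_loss(pred, target, log_probs, rewards)
```

Raising `w_smooth` in this sketch penalizes abrupt viewing-angle jumps more heavily, at the cost of slower tracking of the ground-truth angle.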

Code Implementations: 1 repo