Very Long Term Field of View Prediction for 360-degree Video Streaming
This work addresses bandwidth efficiency and user experience in VR/AR streaming. The contribution is incremental: it builds on existing sequence learning approaches, and its novelty lies in the additional data it integrates into the prediction.
The paper tackles the problem of predicting the future field of view (FoV) in 360-degree video streaming over long time horizons, with the aim of saving bandwidth and reducing video freezing. Evaluations on two public datasets show significant performance improvements over benchmark models.
360-degree videos have gained increasing popularity in recent years with advances in Virtual Reality (VR) and Augmented Reality (AR) technologies. In such applications, a user only watches the video scene within a field of view (FoV) centered in a certain direction. Predicting the future FoV over a long time horizon (more than seconds ahead) can help save bandwidth resources in on-demand video streaming while minimizing video freezing in networks with significant bandwidth variations. In this work, we treat FoV prediction as a sequence learning problem, and propose to predict the target user's future FoV based not only on the user's own past FoV center trajectory but also on other users' future FoV locations. We propose multiple prediction models based on two different FoV representations: one using FoV center trajectories and another using equirectangular heatmaps that represent the FoV center distributions. Extensive evaluations on two public datasets demonstrate that the proposed models can significantly outperform benchmark models, and that other users' FoVs are very helpful for improving long-term predictions.
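To make the second representation more concrete, the following is a minimal sketch of how a set of FoV centers might be binned into an equirectangular heatmap. It is an illustration under stated assumptions, not the paper's implementation: the function name `fov_centers_to_heatmap`, the 9x16 grid, the degree-based (yaw, pitch) convention, and the normalization to a probability distribution are all hypothetical choices that the abstract does not specify.

```python
import math


def fov_centers_to_heatmap(centers_deg, height=9, width=16):
    """Bin FoV centers into a normalized equirectangular grid.

    centers_deg: iterable of (yaw, pitch) pairs in degrees. Assumed
    convention (not from the paper): yaw spans 360 degrees horizontally
    and wraps around; pitch runs from +90 (top row) to -90 (bottom row).
    Returns a height x width list of lists whose cells sum to 1, or an
    all-zero grid when no centers are given.
    """
    heatmap = [[0.0] * width for _ in range(height)]
    count = 0
    for yaw, pitch in centers_deg:
        # Yaw is periodic, so the column index wraps around the grid.
        col = math.floor((yaw + 180.0) / 360.0 * width) % width
        # Pitch is not periodic, so the row index is clamped at the poles.
        row = math.floor((90.0 - pitch) / 180.0 * height)
        row = max(0, min(row, height - 1))
        heatmap[row][col] += 1.0
        count += 1
    if count:
        # Normalize counts so the grid reads as a distribution of centers.
        heatmap = [[v / count for v in r] for r in heatmap]
    return heatmap


# Hypothetical example: FoV centers of four viewers at one video timestamp.
centers = [(0.0, 0.0), (0.0, 0.0), (-180.0, 90.0), (179.9, -90.0)]
grid = fov_centers_to_heatmap(centers)
```

A grid of this kind could summarize where other users look at a given timestamp, whereas the trajectory representation would keep each user's sequence of (yaw, pitch) centers directly. Wrapping the yaw index reflects that the left and right edges of an equirectangular frame are adjacent on the sphere.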