CV · Jun 4, 2018

Cube Padding for Weakly-Supervised Saliency Prediction in 360° Videos

arXiv:1806.01320v1 · 215 citations
Originality: Incremental advance
AI Analysis

This work improves saliency prediction in 360° videos, which viewpoint guidance applications such as Facebook 360 Guide depend on. The contribution is incremental: it builds on existing CNN structures and adds a novel padding method.

The paper tackles automatic saliency prediction in 360° videos. It proposes a weakly-supervised spatial-temporal network with a Cube Padding technique that reduces distortion and removes image boundaries, and it reports better speed and quality than baseline methods.

Automatic saliency prediction in 360° videos is critical for viewpoint guidance applications (e.g., Facebook 360 Guide). We propose a spatial-temporal network which is (1) trained in a weakly-supervised manner and (2) tailor-made for the 360° viewing sphere. Note that most existing methods are less scalable since they rely on annotated saliency maps for training. Most importantly, they convert the 360° sphere to 2D images (e.g., a single equirectangular image or multiple separate Normal Field-of-View (NFoV) images), which introduces distortion and image boundaries. In contrast, we propose a simple and effective Cube Padding (CP) technique as follows. First, we render the 360° view on six faces of a cube using perspective projection, which introduces very little distortion. Then, we concatenate all six faces while utilizing the connectivity between faces on the cube for image padding (i.e., Cube Padding) in convolution, pooling, and convolutional LSTM layers. In this way, CP introduces no image boundary while being applicable to almost all Convolutional Neural Network (CNN) structures. To evaluate our method, we propose Wild-360, a new 360° video saliency dataset containing challenging videos with saliency heatmap annotations. In experiments, our method outperforms baseline methods in both speed and quality.
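
The padding step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the face layout `AXES`, the helper names `dot` and `cube_pad`, the single-channel list-of-lists representation, and the choice to leave corner pixels at zero are all assumptions made for illustration. Each padded pixel is found by folding its position over the shared cube edge and reading the pixel it lands on in the neighbouring face, so no face ends in an artificial image boundary.

```python
# Assumed layout (not from the paper): face -> (outward normal, column axis, row axis),
# with x pointing right, y up, z forward, and each face viewed from the cube centre.
AXES = {
    "front":  ((0, 0, 1),  (1, 0, 0),  (0, -1, 0)),
    "right":  ((1, 0, 0),  (0, 0, -1), (0, -1, 0)),
    "back":   ((0, 0, -1), (-1, 0, 0), (0, -1, 0)),
    "left":   ((-1, 0, 0), (0, 0, 1),  (0, -1, 0)),
    "top":    ((0, 1, 0),  (1, 0, 0),  (0, 0, 1)),
    "bottom": ((0, -1, 0), (1, 0, 0),  (0, 0, -1)),
}
BY_NORMAL = {normal: name for name, (normal, _, _) in AXES.items()}


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cube_pad(faces, p):
    """Pad each w x w face (list of lists) by p pixels taken from adjacent faces."""
    w = len(faces["front"])
    if not 0 < p <= w:
        raise ValueError("padding must satisfy 0 < p <= face width")
    out = {}
    for name, (n, u, v) in AXES.items():
        padded = [[0] * (w + 2 * p) for _ in range(w + 2 * p)]
        for i in range(-p, w + p):
            for j in range(-p, w + p):
                row_in, col_in = 0 <= i < w, 0 <= j < w
                if row_in and col_in:
                    padded[i + p][j + p] = faces[name][i][j]
                    continue
                if not row_in and not col_in:
                    continue  # corner pixel: left at zero in this sketch
                # Doubled integer coordinates: pixel centres sit at odd offsets
                # (same parity as w - 1) and the cube edges at +/- w.
                a, b = 2 * j + 1 - w, 2 * i + 1 - w
                if col_in:   # row index is outside: cross the edge along v
                    over, axis, keep, keep_axis = b, v, a, u
                else:        # column index is outside: cross the edge along u
                    over, axis, keep, keep_axis = a, u, b, v
                sign = 1 if over > 0 else -1
                depth = abs(over) - w  # how far past the edge the pixel lies
                # Fold the point over the edge onto the neighbouring face.
                q = tuple(w * sign * ax + (w - depth) * nc + keep * kc
                          for ax, nc, kc in zip(axis, n, keep_axis))
                neighbour = BY_NORMAL[tuple(sign * c for c in axis)]
                _, nu, nv = AXES[neighbour]
                i2 = (dot(q, nv) + w - 1) // 2
                j2 = (dot(q, nu) + w - 1) // 2
                padded[i + p][j + p] = faces[neighbour][i2][j2]
        out[name] = padded
    return out
```

Under these assumptions, calling `cube_pad(faces, 1)` on six 4×4 faces returns six 6×6 arrays in which, for example, the right-hand padding column of the front face is the first column of the right face. In a network, the same lookup would be applied per channel before each convolution, pooling, or convolutional LSTM layer in place of zero padding.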

