CV Sep 19, 2017

SalNet360: Saliency Maps for omni-directional images with CNN

Rafael Monroy, Sebastian Lutz, Tejo Chalasani, Aljosa Smolic

arXiv:1709.06505v2 18.1160 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for saliency prediction in VR content, which is useful for content creators and encoding algorithms, but it is incremental as it adapts existing techniques to a new media type.

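The authors tackle the problem of predicting visual attention in omnidirectional images by proposing an architectural extension, applicable to any CNN, that fine-tunes traditional 2D saliency prediction for omnidirectional images in an end-to-end manner. They show that each step in the proposed pipeline makes the generated saliency map more accurate with respect to ground truth data.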

The prediction of Visual Attention data from any kind of media is of valuable use to content creators and used to efficiently drive encoding algorithms. With the current trend in the Virtual Reality (VR) field, adapting known techniques to this new kind of media is starting to gain momentum. In this paper, we present an architectural extension to any Convolutional Neural Network (CNN) to fine-tune traditional 2D saliency prediction to Omnidirectional Images (ODIs) in an end-to-end manner. We show that each step in the proposed pipeline works towards making the generated saliency map more accurate with respect to ground truth data.
