Implicit Saliency in Deep Neural Networks
This addresses the challenge of saliency detection for computer vision applications, offering an unsupervised approach that is incremental in leveraging existing architectures.
The paper tackles the problem of predicting human visual saliency without using eye-tracking data, showing that deep neural networks can achieve comparable performance to state-of-the-art supervised algorithms and demonstrate greater robustness to noise.
In this paper, we show that existing recognition and localization deep architectures, that have not been exposed to eye tracking data or any saliency datasets, are capable of predicting the human visual saliency. We term this as implicit saliency in deep neural networks. We calculate this implicit saliency using expectancy-mismatch hypothesis in an unsupervised fashion. Our experiments show that extracting saliency in this fashion provides comparable performance when measured against the state-of-art supervised algorithms. Additionally, the robustness outperforms those algorithms when we add large noise to the input images. Also, we show that semantic features contribute more than low-level features for human visual saliency detection.