CV · Sep 8, 2017

DeepFeat: A Bottom Up and Top Down Saliency Model Based on Deep Features of Convolutional Neural Nets

arXiv:1709.02495v15.024 citations

Originality: Incremental advance

AI Analysis

This work addresses a limitation of traditional saliency models, which rely on a small set of image cues, by offering a more effective method for predicting human fixations. The advance is incremental because it builds on existing deep learning techniques.

The authors tackled the problem of predicting human visual attention by developing DeepFeat, a saliency model that combines bottom-up and top-down approaches using deep features from convolutional neural networks. The combined model outperformed its individual bottom-up and top-down components and, compared with nine state-of-the-art models, achieved satisfactory performance on all four evaluation metrics.

A deep-feature-based saliency model (DeepFeat) is developed to advance the understanding of human fixation prediction. Traditional saliency models often predict human visual attention by relying on only a few levels of image cues. Although such models predict fixations across images of varying complexity, they are limited by the features they incorporate. In this study, we aim to provide an intuitive interpretation of convolutional neural network deep features by combining low- and high-level visual factors. We use four evaluation metrics to evaluate the correspondence between the proposed framework and the ground-truth fixations. The key findings demonstrate that the DeepFeat algorithm, which incorporates both bottom-up and top-down saliency maps, outperforms the individual bottom-up and top-down approaches. Moreover, in comparison to nine state-of-the-art saliency models, our proposed DeepFeat model achieves satisfactory performance on all four evaluation metrics.
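To make the idea of fusing two saliency maps and scoring the result against fixations concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not say how the bottom-up and top-down maps are combined or which four metrics are used. The weighted average in `combine`, the helper names, and the toy arrays are all assumptions for illustration, and Normalized Scanpath Saliency (NSS) is used only as one common fixation-prediction metric.

```python
import numpy as np

def normalize(m):
    """Rescale a map to [0, 1]; a constant map becomes all zeros."""
    m = np.asarray(m, dtype=float)
    span = m.max() - m.min()
    return (m - m.min()) / span if span > 0 else np.zeros_like(m)

def combine(bottom_up, top_down, weight=0.5):
    """Hypothetical fusion rule: weighted average of the two normalized maps."""
    return weight * normalize(bottom_up) + (1.0 - weight) * normalize(top_down)

def nss(saliency, fixations):
    """Normalized Scanpath Saliency: mean z-scored saliency at fixated pixels."""
    s = np.asarray(saliency, dtype=float)
    z = (s - s.mean()) / s.std()
    return z[np.asarray(fixations, dtype=bool)].mean()

# Toy 2x2 maps (made-up values) and a binary fixation mask.
bottom_up = np.array([[0.0, 1.0], [2.0, 4.0]])
top_down = np.array([[0.0, 0.0], [0.0, 1.0]])
fixations = np.array([[0, 0], [0, 1]])

fused = combine(bottom_up, top_down)
score = nss(fused, fixations)
print(fused)
print(score)
```

A positive NSS means the fused map assigns above-average saliency to the fixated pixels. Comparing the score of `fused` with the scores of each input map alone mirrors, in miniature, the comparison the abstract describes.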
