CV · Sep 5, 2016

A Deep Multi-Level Network for Saliency Prediction

arXiv:1609.01064v2 · 374 citations
AI Analysis

This paper addresses visual attention modeling for computer vision applications and represents an incremental advance in saliency prediction methods.

The paper tackles saliency prediction by proposing a deep multi-level network that combines features from different CNN layers, outperforming state-of-the-art models on the SALICON dataset and achieving competitive results on MIT300.

This paper presents a novel deep architecture for saliency prediction. Current state-of-the-art models for saliency prediction employ fully convolutional networks that perform a non-linear combination of features extracted from the last convolutional layer to predict saliency maps. We propose an architecture which, instead, combines features extracted at different levels of a Convolutional Neural Network (CNN). Our model is composed of three main blocks: a feature extraction CNN, a feature encoding network that weights low- and high-level feature maps, and a prior learning network. We compare our solution with state-of-the-art saliency models on two public benchmark datasets. Results show that our model outperforms them under all evaluation metrics on the SALICON dataset, which is currently the largest public dataset for saliency prediction, and achieves competitive results on the MIT300 benchmark.
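The three-block structure described in the abstract can be illustrated with a toy, dependency-free sketch. This is not the authors' implementation: average pooling stands in for the convolutional layers, the per-level weights stand in for learned encoding weights, and applying the prior as an element-wise multiplication is an assumption. All function names here are hypothetical.

```python
def resize_nearest(fmap, out_h, out_w):
    """Nearest-neighbour resize of a 2-D map stored as a list of rows."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [[fmap[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]


def average_pool(image, stride):
    """Non-overlapping average pooling with a square window of size `stride`."""
    h, w = len(image) // stride, len(image[0]) // stride
    area = stride * stride
    return [[sum(image[r * stride + i][c * stride + j]
                 for i in range(stride) for j in range(stride)) / area
             for c in range(w)]
            for r in range(h)]


def extract_features(image):
    """Block 1 (stand-in): return feature maps at three levels of abstraction.

    Assumption: pooling at strides 1, 2 and 4 replaces real CNN layers.
    """
    return [average_pool(image, s) for s in (1, 2, 4)]


def encode_features(levels, weights, out_h, out_w):
    """Block 2 (stand-in): weight low- and high-level maps into one map.

    All levels are resized to a common resolution, then combined per pixel
    with one weight per level (the role a learned 1x1 convolution could play).
    """
    resized = [resize_nearest(f, out_h, out_w) for f in levels]
    return [[sum(w * f[r][c] for w, f in zip(weights, resized))
             for c in range(out_w)]
            for r in range(out_h)]


def apply_prior(saliency, prior):
    """Block 3 (stand-in): modulate the map with a coarse prior.

    Assumption: the prior is upsampled and multiplied element-wise.
    """
    h, w = len(saliency), len(saliency[0])
    upsampled = resize_nearest(prior, h, w)
    return [[saliency[r][c] * upsampled[r][c] for c in range(w)]
            for r in range(h)]


def predict_saliency(image, weights, prior):
    """Chain the three blocks; output has the resolution of the deepest level."""
    levels = extract_features(image)
    out_h, out_w = len(levels[-1]), len(levels[-1][0])
    encoded = encode_features(levels, weights, out_h, out_w)
    return apply_prior(encoded, prior)


# Usage: an 8x8 toy image, one weight per level, and a 1x1 prior.
toy_image = [[float((r + c) % 2) for c in range(8)] for r in range(8)]
toy_map = predict_saliency(toy_image, weights=[0.2, 0.3, 0.5], prior=[[1.0]])
```

The point of the sketch is the data flow: several feature levels are kept rather than only the last one, they are brought to a common resolution and weighted, and a separately parameterised prior reshapes the result. In a trained model the weights and the prior would be learned rather than fixed by hand.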

Code Implementations · 2 repos
