CV · Feb 2, 2017

A Fast and Compact Saliency Score Regression Network Based on Fully Convolutional Network

arXiv:1702.00615v2 · 6 citations
AI Analysis

This work addresses the need for efficient saliency detection in computer vision applications, though it is incremental as it builds on existing deep learning approaches.

The authors address slow and complex visual saliency detection methods by proposing a fast and compact saliency score regression network based on a fully convolutional network. It achieves precision comparable to or better than state-of-the-art methods while running at 35 FPS, fast enough for real-time processing.
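To put the 35 FPS figure in context, the short sketch below converts between frame rate and per-image latency. The function names and the 200 ms comparison value are illustrative assumptions; the source only says that competing methods take "hundreds of milliseconds".

```python
def frame_budget_ms(fps):
    """Per-image time budget in milliseconds at a given frame rate."""
    return 1000.0 / fps


def fps_from_latency_ms(latency_ms):
    """Frame rate implied by a given per-image latency in milliseconds."""
    return 1000.0 / latency_ms


# Reported speed of the proposed network.
budget = frame_budget_ms(35)

# Hypothetical latency standing in for "hundreds of milliseconds".
slow_fps = fps_from_latency_ms(200.0)

print(f"35 FPS allows about {budget:.1f} ms per image")
print(f"A 200 ms method runs at {slow_fps:.1f} FPS")
```

At 35 FPS each image has a budget of roughly 28.6 ms, whereas a method that needs 200 ms per image reaches only 5 FPS.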

Visual saliency detection aims at identifying the most visually distinctive parts of an image and serves as a pre-processing step for a variety of computer vision and image processing tasks. To this end, the saliency detection procedure must be as fast and compact as possible and should ideally process input images in real time; this is an essential practical requirement for the task. However, contemporary detection methods often rely on complicated procedures in pursuit of marginal improvements in detection precision. These procedures take hundreds of milliseconds, which makes the methods hard to apply in practice. In this paper, we tackle this problem by proposing a fast and compact saliency score regression network that employs a fully convolutional network, a special kind of deep convolutional neural network, to estimate the saliency of objects in images. It is an extremely simplified end-to-end deep neural network without any pre-processing or post-processing. Given an image, the network directly predicts a dense full-resolution saliency map (image-to-image prediction). It works like a compact pipeline that effectively simplifies the detection procedure. Our method is evaluated on six public datasets, and experimental results show that it achieves comparable or better precision than state-of-the-art methods while significantly improving detection speed (35 FPS, processing in real time).
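The image-to-image idea can be illustrated with a toy sketch in plain Python. This is not the authors' architecture: the two layers, the hand-picked kernels, and the names `conv2d_same` and `predict_saliency` are assumptions made for illustration only. The point it shows is that a network built only from padded convolutions and pointwise nonlinearities accepts an image of any size and returns a score map of the same resolution, with no pre-processing or post-processing step.

```python
import math


def conv2d_same(image, kernel, bias=0.0):
    """2D convolution with zero padding, so the output matches the input size."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = bias
            for i in range(kh):
                for j in range(kw):
                    yy, xx = y + i - ph, x + j - pw
                    # Pixels outside the image count as zero (zero padding).
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


def relu(fmap):
    return [[max(0.0, v) for v in row] for row in fmap]


def sigmoid(fmap):
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in fmap]


def predict_saliency(image):
    """Toy fully convolutional pass: conv -> ReLU -> conv -> sigmoid.

    The weights are hand-picked placeholders, not trained parameters.
    """
    center_surround = [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]
    smooth = [[1.0 / 9.0] * 3 for _ in range(3)]
    hidden = relu(conv2d_same(image, center_surround))
    return sigmoid(conv2d_same(hidden, smooth, bias=-1.0))


# A 5x7 grayscale image with a single bright pixel in the middle.
image = [[0.0] * 7 for _ in range(5)]
image[2][3] = 1.0
saliency = predict_saliency(image)
print(len(saliency), len(saliency[0]))
```

Because every layer is either a padded convolution or a pointwise function, the same code runs unchanged on a 4x4 or a 480x640 input. A real saliency regression network would learn its kernels from annotated saliency maps and use many more channels and layers.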

