CV · Nov 29, 2019

CAGNet: Content-Aware Guidance for Salient Object Detection

arXiv:1911.13168v2 · 115 citations
Originality: Highly original
AI Analysis

This work addresses salient object detection in computer vision, offering an incremental improvement through a new method aimed at known bottlenecks.

The paper tackles salient object detection in complex scenes, where non-salient regions can look salient and a single salient object can contain regions of varied appearance. It proposes a Feature Guide Network and a Multi-scale Feature Extraction Module, and reports state-of-the-art performance on five datasets at a real-time speed of 28 FPS.

Benefiting from Fully Convolutional Neural Networks (FCNs), saliency detection methods have achieved promising results. However, it is still challenging to learn effective features for detecting salient objects in complicated scenarios, in which i) non-salient regions may have a "salient-like" appearance; ii) the salient objects may have different-looking regions. To handle these complex scenarios, we propose a Feature Guide Network which exploits the nature of low-level and high-level features to i) make foreground and background regions more distinct and suppress the non-salient regions which have a "salient-like" appearance; ii) assign the foreground label to different-looking salient regions. Furthermore, we utilize a Multi-scale Feature Extraction Module (MFEM) at each level of abstraction to obtain multi-scale contextual information. Finally, we design a loss function which outperforms the widely used cross-entropy loss. By adopting four different pre-trained models as the backbone, we show that our method is very general with respect to the choice of the backbone model. Experiments on five challenging datasets demonstrate that our method achieves state-of-the-art performance in terms of different evaluation metrics. Additionally, our approach contains fewer parameters than existing ones, does not need any post-processing, and runs at a real-time speed of 28 FPS when processing a 480 × 480 image.
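To put the reported speed in perspective, the 28 FPS figure for a 480 × 480 input can be converted into a per-frame time budget and a pixel throughput. The snippet below is only an illustrative calculation: the helper name `frame_budget` and its return keys are assumptions made for this example and do not come from the paper or its code.

```python
def frame_budget(fps: float, width: int, height: int) -> dict:
    """Convert a throughput figure (frames per second) into per-frame
    latency and pixel rate. Hypothetical helper, not part of CAGNet."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    pixels_per_frame = width * height
    return {
        "ms_per_frame": 1000.0 / fps,               # time available per image
        "pixels_per_frame": pixels_per_frame,       # input resolution in pixels
        "pixels_per_second": pixels_per_frame * fps,
    }


# Figures quoted in the abstract: 28 FPS on a 480 x 480 image.
budget = frame_budget(fps=28, width=480, height=480)
print(f"{budget['ms_per_frame']:.1f} ms per frame")
print(f"{budget['pixels_per_second']:,} pixels per second")
```

Under these figures, each image has roughly 35.7 ms of processing time, which is the end-to-end budget the network must meet without any post-processing step.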

Code Implementations: 3 repos
