CVDec 18, 2019

Salient Object Detection with Purificatory Mechanism and Structural Similarity Loss

arXiv:1912.08393v268 citations
Originality Incremental advance
AI Analysis

This work addresses the challenge of improving salient object detection for computer vision applications, representing an incremental advancement by refining attention mechanisms to handle error-prone regions.

The paper tackles the problem of accurately locating salient objects in images, particularly in regions with indistinguishable foreground/background or complex structures, by proposing a convolutional neural network with a purificatory mechanism and structural similarity loss, which outperforms 19 state-of-the-art methods on six datasets with over 27 FPS on a single GPU.

By the aid of attention mechanisms to weight the image features adaptively, recent advanced deep learning-based models encourage the predicted results to approximate the ground-truth masks with as large predictable areas as possible, thus achieving the state-of-the-art performance. However, these methods do not pay enough attention to small areas prone to misprediction. In this way, it is still tough to accurately locate salient objects due to the existence of regions with indistinguishable foreground and background and regions with complex or fine structures. To address these problems, we propose a novel convolutional neural network with purificatory mechanism and structural similarity loss. Specifically, in order to better locate preliminary salient objects, we first introduce the promotion attention, which is based on spatial and channel attention mechanisms to promote attention to salient regions. Subsequently, for the purpose of restoring the indistinguishable regions that can be regarded as error-prone regions of one model, we propose the rectification attention, which is learned from the areas of wrong prediction and guide the network to focus on error-prone regions thus rectifying errors. Through these two attentions, we use the Purificatory Mechanism to impose strict weights with different regions of the whole salient objects and purify results from hard-to-distinguish regions, thus accurately predicting the locations and details of salient objects. In addition to paying different attention to these hard-to-distinguish regions, we also consider the structural constraints on complex regions and propose the Structural Similarity Loss. In experiments, the proposed approach outperforms 19 state-of-the-art methods on six datasets with a notable margin at over 27FPS on a single NVIDIA 1080Ti GPU.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes