CV · Apr 5, 2021

BTS-Net: Bi-directional Transfer-and-Selection Network For RGB-D Salient Object Detection

Wenbo Zhang, Yao Jiang, Keren Fu, Qijun Zhao

arXiv:2104.01784v112.189 citations · Has Code

Originality: Incremental advance

AI Analysis

This work targets a specific computer-vision problem, making salient object detection more robust when depth data is used, and represents an incremental advance.

The paper tackles low-quality depth maps in RGB-D salient object detection. It proposes BTS-Net, a network whose bi-directional RGB-depth interactions purify encoder features, and reports outperforming 16 state-of-the-art methods on six datasets across four metrics.

Depth information has proven beneficial in RGB-D salient object detection (SOD). However, the depth maps obtained often suffer from low quality and inaccuracy. Most existing RGB-D SOD models have no cross-modal interactions, or only unidirectional interactions from depth to RGB, in their encoder stages, which may lead to inaccurate encoder features when facing low-quality depth. To address this limitation, we propose to conduct progressive bi-directional interactions as early as the encoder stage, yielding a novel bi-directional transfer-and-selection network named BTS-Net, which adopts a set of bi-directional transfer-and-selection (BTS) modules to purify features during encoding. Based on the resulting robust encoder features, we also design an effective light-weight group decoder to achieve accurate final saliency prediction. Comprehensive experiments on six widely used datasets demonstrate that BTS-Net surpasses 16 recent state-of-the-art approaches in terms of four key metrics.
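To make the idea of a bi-directional interaction concrete, here is a toy sketch in plain Python. It is not the BTS module from the paper: the abstract does not describe the module's internals, so the sigmoid gating, the residual update, and the name `bidirectional_exchange` are all assumptions chosen for illustration. Features are flat lists of floats rather than real feature maps.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def bidirectional_exchange(rgb, depth):
    """One toy bi-directional step between two equal-length feature vectors.

    Hypothetical gating, NOT the paper's BTS module:
      - transfer: each branch receives the other branch's features;
      - selection: a gate computed from the receiving branch decides
        how much of the transferred features to accept.
    Both outputs are computed from the original inputs, so the
    exchange is symmetric (RGB -> depth and depth -> RGB).
    """
    if len(rgb) != len(depth):
        raise ValueError("rgb and depth features must have the same length")
    rgb_out = [r + sigmoid(r) * d for r, d in zip(rgb, depth)]
    depth_out = [d + sigmoid(d) * r for r, d in zip(rgb, depth)]
    return rgb_out, depth_out


def progressive_encoder(rgb, depth, num_stages=3):
    """Apply the exchange once per (hypothetical) encoder stage."""
    for _ in range(num_stages):
        rgb, depth = bidirectional_exchange(rgb, depth)
    return rgb, depth
```

A unidirectional design would update only `rgb_out` from `depth`. The sketch updates both branches at every stage, which is the property the abstract emphasizes, so that a noisy depth branch can also be corrected by RGB features rather than only contaminating them.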
