Ro-SOS: Metric Expression Network (MEnet) for Robust Salient Object Segmentation
This work addresses robustness in image saliency detection for computer vision applications, but the contribution is incremental: it builds on existing CNN approaches and adds a novel metric space.
The paper tackles the problem of salient object segmentation being sensitive to distortions like compression and noise by proposing the Metric Expression Network (MEnet), which constructs a topological metric space for robust segmentation and outperforms previous CNN-based methods on distorted inputs.
Although deep CNNs have brought significant improvement to image saliency detection, most CNN-based models are sensitive to distortions such as compression and noise. In this paper, we propose an end-to-end generic salient object segmentation model called Metric Expression Network (MEnet) to perform saliency detection that tolerates distortion. Within MEnet, a new topological metric space is constructed whose implicit metric is determined by the deep network. As a result, we group all pixels of the observed image semantically within this latent space into two regions: a salient region and a non-salient region. With this architecture, all feature extraction is carried out at the pixel level, enabling fine-grained boundaries for the salient objects. In addition, we provide a general analysis of the network's noise robustness, following the literature on Lipschitz and Jacobian analysis. Experiments demonstrate that the proposed metric produces robust saliency maps that facilitate object segmentation. Tests on several public benchmarks show that MEnet achieves desirable performance. Furthermore, direct computation and measurement of robustness show that the proposed method outperforms previous CNN-based methods on distorted inputs.
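The abstract frames noise robustness in terms of Lipschitz and Jacobian analysis. As a rough illustration of that idea only, and not the paper's actual procedure, the sketch below estimates a lower bound on a model's local Lipschitz constant by probing it with small random perturbations and recording the largest ratio of output change to input change. The names `empirical_lipschitz`, `l2_norm`, and `toy_model`, along with the parameters `eps` and `n_trials`, are assumptions introduced here; the toy model is a hypothetical stand-in for a real network.

```python
import math
import random


def l2_norm(v):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(c * c for c in v))


def empirical_lipschitz(f, x, eps=1e-2, n_trials=200, seed=0):
    """Lower-bound the local Lipschitz constant of f around x.

    Samples random directions, rescales each to norm eps, and returns the
    largest observed ratio ||f(x + d) - f(x)|| / ||d||.
    """
    rng = random.Random(seed)
    fx = f(x)
    best = 0.0
    for _ in range(n_trials):
        d = [rng.gauss(0.0, 1.0) for _ in x]
        scale = eps / l2_norm(d)  # rescale the direction to norm eps
        x_pert = [xi + scale * di for xi, di in zip(x, d)]
        diff = [a - b for a, b in zip(f(x_pert), fx)]
        best = max(best, l2_norm(diff) / eps)
    return best


def toy_model(v):
    # Hypothetical stand-in for a network: a diagonal linear map whose
    # true Lipschitz constant (largest singular value) is 2.0.
    return [2.0 * v[0], 0.5 * v[1]]


estimate = empirical_lipschitz(toy_model, [1.0, -1.0])
print(f"estimated local Lipschitz lower bound: {estimate:.4f}")
```

Because the probe is random, the result is only a lower bound on the true constant; a smaller value for a segmentation model would suggest, under these assumptions, that small input distortions cause smaller changes in the output map.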