Enhancing Salient Object Segmentation Through Attention
This work addresses false positives in salient object segmentation, a problem relevant to applications such as image editing and object detection. The contribution is incremental, since it builds on existing segmentation methods.
The paper tackles false positives in salient object segmentation caused by cluttered backgrounds and low-quality images. It proposes an iterative attention mechanism that uses a ConvGRU to refine the segmentation mask, and it reports superior performance on benchmark datasets without post-processing.
Segmenting salient objects in an image is an important vision task with ubiquitous applications. The problem becomes more challenging in the presence of a cluttered and textured background, or with low-resolution and/or low-contrast images. Even though existing algorithms perform well in segmenting most of the object(s) of interest, they often produce false positives because regions of the background resemble salient objects. In this work, we tackle this problem by iteratively attending to image patches in a recurrent fashion and subsequently enhancing the predicted segmentation mask. Saliency features are estimated independently for every image patch and are then combined using an aggregation strategy based on a Convolutional Gated Recurrent Unit (ConvGRU) network. The proposed approach works in an end-to-end manner, removing background noise and false positives incrementally. Through extensive evaluation on various benchmark datasets, we show superior performance to the existing approaches without any post-processing.
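As an illustration of the aggregation step, the following is a minimal sketch of a ConvGRU cell folding per-patch saliency maps into one running state. It is not the authors' implementation: the abstract does not give the architecture, so the sketch assumes single-channel maps, 3x3 kernels, and that each patch's features have already been placed on a full-size canvas (zeros outside the patch). The names `conv3x3`, `convgru_step`, `aggregate_patches`, and the parameter keys are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def conv3x3(x, k):
    """Zero-padded 'same' 3x3 cross-correlation of a 2D list x with kernel k."""
    rows, cols = len(x), len(x[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            s = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < rows and 0 <= jj < cols:
                        s += x[ii][jj] * k[di + 1][dj + 1]
            out[i][j] = s
    return out


def convgru_step(h, x, p):
    """One ConvGRU update of hidden state h given input map x.

    p is a dict of hypothetical parameters: 3x3 kernels Wz, Uz, Wr, Ur,
    Wh, Uh and scalar biases bz, br, bh.
    """
    rows, cols = len(h), len(h[0])
    xz, hz = conv3x3(x, p["Wz"]), conv3x3(h, p["Uz"])
    xr, hr = conv3x3(x, p["Wr"]), conv3x3(h, p["Ur"])
    # Update gate z and reset gate r, computed per pixel.
    z = [[sigmoid(xz[i][j] + hz[i][j] + p["bz"]) for j in range(cols)]
         for i in range(rows)]
    r = [[sigmoid(xr[i][j] + hr[i][j] + p["br"]) for j in range(cols)]
         for i in range(rows)]
    # Candidate state uses the reset-gated hidden state.
    rh = [[r[i][j] * h[i][j] for j in range(cols)] for i in range(rows)]
    xh, hh = conv3x3(x, p["Wh"]), conv3x3(rh, p["Uh"])
    cand = [[math.tanh(xh[i][j] + hh[i][j] + p["bh"]) for j in range(cols)]
            for i in range(rows)]
    # Convention assumed here: z weights the candidate, (1 - z) the old state.
    return [[(1.0 - z[i][j]) * h[i][j] + z[i][j] * cand[i][j]
             for j in range(cols)] for i in range(rows)]


def aggregate_patches(patch_maps, p):
    """Fold a sequence of equally sized per-patch maps into one state map."""
    rows, cols = len(patch_maps[0]), len(patch_maps[0][0])
    h = [[0.0] * cols for _ in range(rows)]  # initial hidden state
    for x in patch_maps:
        h = convgru_step(h, x, p)
    return h
```

In a trained model the kernels would be learned end to end and the final state would feed a mask prediction layer. Here the state is simply returned, so the sketch shows only how information from successive patches can accumulate in a spatial hidden state.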