Deep Object Co-Segmentation
This work addresses object co-segmentation in computer vision, offering an incremental improvement through a novel method for a known bottleneck.
The paper tackles the problem of segmenting common objects of the same class in image pairs by ignoring background distractions, proposing a CNN-based Siamese encoder-decoder architecture that outperforms competing methods on standard datasets for both seen and unseen object classes.
This work presents a deep object co-segmentation (DOCS) approach for segmenting common objects of the same class within a pair of images. This means that the method learns to ignore background stuff, whether or not it is shared between the two images, and to focus on the objects. If multiple object classes are present in the image pair, they are jointly extracted as foreground. To address this task, we propose a CNN-based Siamese encoder-decoder architecture. The encoder extracts high-level semantic features of the foreground objects, a mutual correlation layer detects the common objects, and finally, the decoder generates the output foreground mask for each image. To train our model, we compile a large object co-segmentation dataset consisting of image pairs from the PASCAL VOC dataset with common-object masks. We evaluate our approach on commonly used datasets for co-segmentation tasks and observe that it consistently outperforms competing methods, for both seen and unseen object classes.
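To make the role of the mutual correlation layer more concrete, the following is a minimal sketch in plain Python. It is not the paper's implementation: the abstract does not specify how the correlation is computed or how it is passed to the decoder. The sketch assumes that each image has already been encoded into a flattened grid of feature vectors, that similarity is measured by cosine similarity, and that a per-location "best match" score is a reasonable stand-in for what a decoder would consume. The function names `l2_normalize`, `mutual_correlation`, and `best_match_score` are hypothetical.

```python
from math import sqrt


def l2_normalize(vec):
    """Scale a feature vector to unit length (zero vectors are returned unchanged)."""
    norm = sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def mutual_correlation(feats_a, feats_b):
    """Correlate every location of image A with every location of image B.

    feats_a, feats_b: lists of feature vectors, one per spatial location
    (an H x W grid flattened to H*W entries). Assumed to come from a shared
    (Siamese) encoder, which is not modelled here.

    Returns (corr_ab, corr_ba): corr_ab[i][j] is the cosine similarity between
    location i of A and location j of B; corr_ba is its transpose.
    """
    a = [l2_normalize(v) for v in feats_a]
    b = [l2_normalize(v) for v in feats_b]
    corr_ab = [[sum(x * y for x, y in zip(va, vb)) for vb in b] for va in a]
    corr_ba = [list(col) for col in zip(*corr_ab)]
    return corr_ab, corr_ba


def best_match_score(corr):
    """Illustrative reduction (an assumption): best similarity found in the other image."""
    return [max(row) for row in corr]


# Toy features: location 0 is an "object" direction shared by both images,
# location 1 is background that differs between the two images.
feats_a = [[1.0, 0.0], [0.0, 1.0]]
feats_b = [[2.0, 0.0], [-1.0, -1.0]]

corr_ab, corr_ba = mutual_correlation(feats_a, feats_b)
score_a = best_match_score(corr_ab)  # high where A has a counterpart in B
score_b = best_match_score(corr_ba)  # high where B has a counterpart in A
```

In this toy case the shared object location receives a high score in both images, while the unshared background locations do not. In a full model of the kind the abstract describes, a learned decoder would turn such correlation information into per-image foreground masks rather than relying on a hand-written reduction.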