CV · Sep 4, 2024

Pluralistic Salient Object Detection

arXiv:2409.02368v1 · 3 citations · h-index: 8
Originality: Incremental advance
AI Analysis

This paper addresses the ambiguity in defining salient objects in real-world images for computer vision applications. The advance is incremental, as it builds on existing SOD methods.

The paper introduces the pluralistic salient object detection (PSOD) task, which generates multiple plausible salient segmentation results for a given image. It presents two new datasets (DUTS-MM and DUTS-MQ) and a Mixture-of-Experts baseline that predicts multiple masks together with a human preference score for each mask; the reported experiments support the effectiveness of both the datasets and the baseline.

We introduce pluralistic salient object detection (PSOD), a novel task aimed at generating multiple plausible salient segmentation results for a given input image. Unlike conventional SOD methods that produce a single segmentation mask for salient objects, this new setting recognizes the inherent complexity of real-world images, which comprise multiple objects, and the ambiguity in defining salient objects due to different user intentions. To study this task, we present two new SOD datasets, "DUTS-MM" and "DUTS-MQ", along with newly designed evaluation metrics. DUTS-MM builds upon the DUTS dataset but enriches the ground-truth mask annotations in three respects: it 1) improves the mask quality, especially for boundaries and fine-grained structures; 2) alleviates the annotation inconsistency issue; and 3) provides multiple ground-truth masks for images with saliency ambiguity. DUTS-MQ consists of approximately 100K image-mask pairs with human-annotated preference scores, enabling the learning of real human preferences in measuring mask quality. Building upon these two datasets, we propose a simple yet effective pluralistic SOD baseline based on a Mixture-of-Experts (MOE) design. Equipped with two prediction heads, it simultaneously predicts multiple masks using different query prompts and predicts a human preference score for each mask candidate. Extensive experiments and analyses underscore the significance of our proposed datasets and affirm the effectiveness of our PSOD framework.
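
To make the setting concrete, here is a minimal sketch of how the output of a pluralistic predictor might be consumed: several candidate masks, each with a predicted preference score, compared against several ground-truth masks. It is not the paper's model or its evaluation metrics, which the abstract does not specify. The function names (`iou`, `rank_candidates`, `best_match_iou`), the nested-list mask representation, and the toy data are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation: a binary mask as nested lists of 0/1.
Mask = List[List[int]]


def iou(a: Mask, b: Mask) -> float:
    """Intersection-over-union of two binary masks of equal shape."""
    inter = 0
    union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            inter += 1 if (pa and pb) else 0
            union += 1 if (pa or pb) else 0
    # Two empty masks are treated as a perfect match (an assumption).
    return inter / union if union else 1.0


def rank_candidates(
    masks: Sequence[Mask], scores: Sequence[float]
) -> List[Tuple[Mask, float]]:
    """Sort candidate masks by predicted preference score, highest first."""
    return sorted(zip(masks, scores), key=lambda pair: pair[1], reverse=True)


def best_match_iou(
    candidates: Sequence[Mask], ground_truths: Sequence[Mask]
) -> List[float]:
    """For each ground-truth mask, the IoU of its best-matching candidate."""
    return [max(iou(c, gt) for c in candidates) for gt in ground_truths]


# Toy 2x2 example: three candidates with made-up preference scores,
# and two plausible ground-truth masks for the same image.
candidates = [[[1, 1], [0, 0]], [[1, 1], [1, 1]], [[0, 0], [0, 1]]]
scores = [0.9, 0.6, 0.1]
ground_truths = [[[1, 1], [0, 0]], [[1, 1], [1, 0]]]

ranked = rank_candidates(candidates, scores)
coverage = best_match_iou(candidates, ground_truths)
```

In this sketch, `ranked[0]` is the candidate a user would be shown first, while `coverage` indicates how well the candidate set as a whole covers each plausible interpretation of saliency.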

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
