CV | Dec 14, 2024

Rethinking Detecting Salient and Camouflaged Objects in Unconstrained Scenes

arXiv:2412.10943v3 | 4 citations | h-index: 11 | Has Code
Originality: Incremental advance
AI Analysis

This paper addresses a key challenge in computer vision, relevant to applications such as surveillance and robotics, by moving beyond constrained datasets to real-world scenes. The advance is incremental, since it builds on existing salient object detection (SOD) and camouflaged object detection (COD) methods.

The paper tackles the problem of disentangling salient and camouflaged object detection in unconstrained scenes, where existing models often misclassify these objects because of dataset constraints and a lack of relationship modeling. It contributes a new dataset, a new model, and a new evaluation metric, and reports state-of-the-art performance across all scenes.

While the human visual system employs distinct mechanisms to perceive salient and camouflaged objects, existing models struggle to disentangle these tasks. Specifically, salient object detection (SOD) models frequently misclassify camouflaged objects as salient, while camouflaged object detection (COD) models conversely misinterpret salient objects as camouflaged. We hypothesize that this can be attributed to two factors: (i) the specific annotation paradigm of current SOD and COD datasets, and (ii) the lack of explicit attribute relationship modeling in current models. Prevalent SOD/COD datasets enforce a mutual exclusivity constraint, assuming scenes contain either salient or camouflaged objects, which poorly aligns with the real world. Furthermore, current SOD/COD methods are primarily designed for these highly constrained datasets and lack explicit modeling of the relationship between salient and camouflaged objects. In this paper, to promote the development of unconstrained salient and camouflaged object detection, we construct a large-scale dataset, USC12K, which features comprehensive labels and four different scenes that cover all possible logical existence scenarios of both salient and camouflaged objects. To explicitly model the relationship between salient and camouflaged objects, we propose a model called USCNet, which introduces two distinct prompt query mechanisms for modeling inter-sample and intra-sample attribute relationships. Additionally, to assess the model's ability to distinguish between salient and camouflaged objects, we design an evaluation metric called CSCS. The proposed method achieves state-of-the-art performance across all scenes in various metrics. The code and dataset will be available at https://github.com/ssecv/USCNet.
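The "four different scenes" follow from two independent yes/no questions: does the image contain a salient object, and does it contain a camouflaged object? The minimal sketch below enumerates those combinations and classifies an image from a pair of binary masks. The scene names, the `scene_type` helper, and the mask layout (nested lists of 0/1) are illustrative assumptions, not the labels or data format used by USC12K.

```python
from itertools import product

# Hypothetical scene labels; the dataset's own naming may differ.
SCENE_NAMES = {
    (True, False): "salient only",
    (False, True): "camouflaged only",
    (True, True): "salient and camouflaged",
    (False, False): "neither (background only)",
}


def scene_type(salient_mask, camouflaged_mask):
    """Classify an image by which object types its binary masks contain.

    Each mask is assumed to be a nested list of 0/1 values, where 1 marks
    a pixel belonging to an object of that type.
    """
    has_salient = any(any(row) for row in salient_mask)
    has_camouflaged = any(any(row) for row in camouflaged_mask)
    return SCENE_NAMES[(has_salient, has_camouflaged)]


# Every presence/absence combination of the two attributes: 2 x 2 = 4 scenes.
all_scenes = [SCENE_NAMES[combo] for combo in product([True, False], repeat=2)]

if __name__ == "__main__":
    empty = [[0, 0], [0, 0]]
    one_object = [[0, 1], [0, 0]]
    print(all_scenes)
    print(scene_type(one_object, empty))
    print(scene_type(one_object, one_object))
```

A dataset restricted by the mutual exclusivity constraint described above would only ever produce the first two keys of `SCENE_NAMES`; the unconstrained setting also admits the other two.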

