CV · Apr 11, 2023

CamDiff: Camouflage Image Augmentation via Diffusion Model

Xue-Jing Luo, Shuo Wang, Zongwei Wu, Christos Sakaridis, Yun Cheng, Deng-Ping Fan, Luc Van Gool

arXiv:2304.05469v1 · 14.125 citations · h-index: 191 · Has Code

Originality: Incremental advance

AI Analysis

The paper addresses robustness issues in camouflaged object detection (COD) for computer vision applications, but the advance is incremental because it builds on existing diffusion and CLIP models for dataset augmentation.

The paper tackles a failure mode of COD models: they misclassify salient objects as camouflaged, which the authors attribute to a lack of multi-pattern training data. It introduces CamDiff, a diffusion-based method that synthesizes salient objects in camouflaged scenes. According to the authors, training and testing on the augmented data makes baseline COD models more robust across domains.

The burgeoning field of camouflaged object detection (COD) seeks to identify objects that blend into their surroundings. Despite the impressive performance of recent models, we have identified a limitation in their robustness, where existing methods may misclassify salient objects as camouflaged ones, despite these two characteristics being contradictory. This limitation may stem from lacking multi-pattern training images, leading to less saliency robustness. To address this issue, we introduce CamDiff, a novel approach inspired by AI-Generated Content (AIGC) that overcomes the scarcity of multi-pattern training images. Specifically, we leverage the latent diffusion model to synthesize salient objects in camouflaged scenes, while using the zero-shot image classification ability of the Contrastive Language-Image Pre-training (CLIP) model to prevent synthesis failures and ensure the synthesized object aligns with the input prompt. Consequently, the synthesized image retains its original camouflage label while incorporating salient objects, yielding camouflage samples with richer characteristics. The results of user studies show that the salient objects in the scenes synthesized by our framework attract the user's attention more; thus, such samples pose a greater challenge to the existing COD models. Our approach enables flexible editing and efficient large-scale dataset generation at a low cost. It significantly enhances COD baselines' training and testing phases, emphasizing robustness across diverse domains. Our newly-generated datasets and source code are available at https://github.com/drlxj/CamDiff.
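The generate-then-verify idea in the abstract (synthesize a salient object, then use zero-shot classification to reject failed syntheses) can be sketched as a small retry loop. This is a minimal illustration, not the authors' implementation: `inpaint` and `similarity` are hypothetical callables standing in for a latent diffusion inpainting step and a CLIP image-text similarity score, and `max_tries` and `min_prob` are assumed parameters that do not come from the paper.

```python
import math
from typing import Any, Callable, List, Optional, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of similarity scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def synthesize_with_check(
    scene: Any,
    prompt: str,
    labels: Sequence[str],
    inpaint: Callable[[Any, str, int], Any],   # hypothetical diffusion step
    similarity: Callable[[Any, str], float],   # hypothetical CLIP-style score
    max_tries: int = 5,                        # assumed retry budget
    min_prob: float = 0.5,                     # assumed acceptance threshold
) -> Optional[Any]:
    """Return the first synthesized image whose zero-shot label matches
    the prompt, or None if every attempt is rejected."""
    if prompt not in labels:
        raise ValueError("prompt must be one of the candidate labels")
    target = list(labels).index(prompt)

    for attempt in range(max_tries):
        # Use the attempt index as a seed so each retry differs.
        candidate = inpaint(scene, prompt, attempt)

        # Zero-shot classification: softmax over per-label similarities.
        probs = softmax([similarity(candidate, label) for label in labels])
        best = max(range(len(probs)), key=probs.__getitem__)

        # Accept only if the predicted label is the prompted object.
        if best == target and probs[target] >= min_prob:
            return candidate
    return None
```

In practice the two callables would wrap real models (for example, an inpainting pipeline and a CLIP model); keeping them as parameters lets the acceptance logic be tested without downloading weights.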

