CV · May 19

What Makes Synthetic Data Effective in Image Segmentation

Jinjin Zhang, Xiefan Guo, Yizhou Jin, Nan Zhou, Di Huang

arXiv:2605.19289 · 83.3 · Has Code

AI Analysis

For computer vision researchers, this work provides a model-agnostic method to leverage synthetic data for segmentation, though the gains are incremental over existing augmentation techniques.

The paper systematically analyzes synthetic images generated by diffusion models for segmentation and finds that dense scene composition and fine instance fidelity yield more discriminative spatial representations. The authors' SENSE framework improves segmentation performance on Cityscapes, COCO, and ADE20K (e.g., +2.1 mIoU on Cityscapes).

Driven by rapid advances in large-scale generative models, synthetic data has emerged as a promising solution for visual understanding. While modern diffusion models achieve remarkable photorealistic image synthesis, their potential in complex visual segmentation tasks remains underexplored. In this work, we conduct a systematic analysis of synthetic images from state-of-the-art diffusion models to uncover the factors governing their utility. In particular, synthetic images characterized by dense scene composition and fine instance fidelity demonstrate distinctive benefits, yielding significantly more discriminative spatial representations. Building on these insights, we propose SENSE, a unified framework that leverages flexible and scalable synthetic data to substantially enhance segmentation performance. Notably, SENSE is model-agnostic, compatible with diverse architectures (e.g., DPT and Mask2Former), and scales effectively across models with varying parameter capacities. Extensive experiments on Cityscapes, COCO, and ADE20K validate the effectiveness and generalization capability of our approach. Code is available at https://github.com/zhang0jhon/SENSE.
