CV · Mar 11, 2025

TRACE: Your Diffusion Model is Secretly an Instance Edge Detector

arXiv:2503.07982v2 · 2 citations · h-index: 4 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the cost and scalability of segmentation annotation in computer vision by offering a practical alternative to manual labeling. Its advance is incremental, since it builds on existing diffusion models.

The paper tackles high-quality instance and panoptic segmentation without costly manual annotations by showing that text-to-image diffusion models can act as instance edge detectors. It reports +5.1 AP on COCO for unsupervised instance segmentation and, in tag-supervised panoptic segmentation, +1.7 PQ over point-supervised baselines.

High-quality instance and panoptic segmentation has traditionally relied on dense instance-level annotations such as masks, boxes, or points, which are costly, inconsistent, and difficult to scale. Unsupervised and weakly-supervised approaches reduce this burden but remain limited by semantic backbone constraints and human bias, often producing merged or fragmented outputs. We present TRACE (TRAnsforming diffusion Cues to instance Edges), showing that text-to-image diffusion models secretly function as instance edge annotators. TRACE identifies the Instance Emergence Point (IEP) where object boundaries first appear in self-attention maps, extracts boundaries through Attention Boundary Divergence (ABDiv), and distills them into a lightweight one-step edge decoder. This design removes the need for per-image diffusion inversion, achieving 81x faster inference while producing sharper and more connected boundaries. On the COCO benchmark, TRACE improves unsupervised instance segmentation by +5.1 AP, and in tag-supervised panoptic segmentation it outperforms point-supervised baselines by +1.7 PQ without using any instance-level labels. These results reveal that diffusion models encode hidden instance boundary priors, and that decoding these signals offers a practical and scalable alternative to costly manual annotation. Code is available at https://github.com/shjo-april/DiffEGG.
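
To make the idea of reading boundaries out of self-attention concrete, here is a minimal sketch in Python. It is not the paper's ABDiv: the abstract does not give that definition, so this example assumes a simple stand-in, the symmetric KL divergence between the attention distributions of adjacent pixels. The function names `toy_attention` and `neighbor_attention_divergence` are hypothetical, and the attention map is synthetic rather than taken from a diffusion model.

```python
import numpy as np


def toy_attention(labels):
    """Build a synthetic self-attention map (an assumption, not real model output).

    Each pixel attends uniformly to all pixels sharing its instance label.
    Returns an array of shape (H*W, H*W) whose rows sum to 1.
    """
    flat = labels.reshape(-1)
    same = (flat[:, None] == flat[None, :]).astype(float)
    return same / same.sum(axis=1, keepdims=True)


def neighbor_attention_divergence(attn, height, width, eps=1e-8):
    """Score each pixel by how much its attention differs from its neighbors'.

    Stand-in for a boundary divergence: symmetric KL between the attention
    rows of horizontally and vertically adjacent pixels, accumulated per pixel.
    """
    maps = attn.reshape(height, width, height * width)
    maps = maps / (maps.sum(axis=-1, keepdims=True) + eps)

    def sym_kl(p, q):
        p = p + eps
        q = q + eps
        return np.sum(p * np.log(p / q) + q * np.log(q / p), axis=-1)

    d_h = sym_kl(maps[:, :-1], maps[:, 1:])  # shape (H, W-1)
    d_v = sym_kl(maps[:-1, :], maps[1:, :])  # shape (H-1, W)

    edges = np.zeros((height, width))
    edges[:, :-1] += d_h
    edges[:, 1:] += d_h
    edges[:-1, :] += d_v
    edges[1:, :] += d_v
    return edges


# Two toy "instances": left half labelled 0, right half labelled 1.
labels = np.zeros((4, 4), dtype=int)
labels[:, 2:] = 1
edges = neighbor_attention_divergence(toy_attention(labels), 4, 4)
print(np.round(edges, 1))
```

In this toy case the score is zero inside each region and large only in the two columns that touch the boundary between them. In the paper's pipeline, by contrast, the boundaries come from a diffusion model's self-attention at the Instance Emergence Point and are then distilled into a one-step edge decoder.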

