CVJan 12

GenDet: Painting Colored Bounding Boxes on Images via Diffusion Model for Object Detection

Chen Min, Chengyang Li, Fanjie Kong, Qi Zhu, Dawei Zhao, Liang Xiao

arXiv:2601.07273v11.5h-index: 9

Originality Incremental advance

AI Analysis

This provides a novel approach to object detection that bridges generative and discriminative methods, though it appears incremental in performance.

The paper tackles object detection by reframing it as an image generation task using a diffusion model, achieving competitive accuracy while maintaining generative flexibility.

This paper presents GenDet, a novel framework that redefines object detection as an image generation task. In contrast to traditional approaches, GenDet adopts a pioneering approach by leveraging generative modeling: it conditions on the input image and directly generates bounding boxes with semantic annotations in the original image space. GenDet establishes a conditional generation architecture built upon the large-scale pre-trained Stable Diffusion model, formulating the detection task as semantic constraints within the latent space. It enables precise control over bounding box positions and category attributes, while preserving the flexibility of the generative model. This novel methodology effectively bridges the gap between generative models and discriminative tasks, providing a fresh perspective for constructing unified visual understanding systems. Systematic experiments demonstrate that GenDet achieves competitive accuracy compared to discriminative detectors, while retaining the flexibility characteristic of generative methods.

View on arXiv PDF

Similar