CV | LG | Sep 8, 2024

Can OOD Object Detectors Learn from Foundation Models?

Jiahui Liu, Xin Wen, Shizhen Zhao, Yingxian Chen, Xiaojuan Qi

arXiv:2409.05162v1 | 12.816 citations | h-index: 12 | Has Code

Originality: Incremental advance

AI Analysis

This paper addresses OOD object detection for computer vision applications, offering an incremental improvement by leveraging foundation models.

The paper tackles out-of-distribution (OOD) object detection by using text-to-image generative models to synthesize OOD samples. The resulting method, SyncOOD, is reported to significantly outperform existing methods and to establish new state-of-the-art performance while using minimal synthetic data.

Out-of-distribution (OOD) object detection is a challenging task due to the absence of open-set OOD data. Inspired by recent advancements in text-to-image generative models, such as Stable Diffusion, we study the potential of generative models trained on large-scale open-set data to synthesize OOD samples, thereby enhancing OOD object detection. We introduce SyncOOD, a simple data curation method that capitalizes on the capabilities of large foundation models to automatically extract meaningful OOD data from text-to-image generative models. This offers the model access to open-world knowledge encapsulated within off-the-shelf foundation models. The synthetic OOD samples are then employed to augment the training of a lightweight, plug-and-play OOD detector, thus effectively optimizing the in-distribution (ID)/OOD decision boundaries. Extensive experiments across multiple benchmarks demonstrate that SyncOOD significantly outperforms existing methods, establishing new state-of-the-art performance with minimal synthetic data usage.
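The abstract's last step, training a lightweight OOD detector on ID features plus synthetic OOD features, can be illustrated with a toy sketch. This is not the authors' implementation: the logistic-regression head, the function names `train_ood_head` and `ood_score`, and the random Gaussian vectors standing in for object features are all assumptions made for illustration. In the paper's setting the features would come from a trained object detector and the OOD samples from a text-to-image model.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_ood_head(id_feats, ood_feats, lr=0.1, epochs=200):
    """Fit a binary head by gradient descent: label 0 = ID, label 1 = OOD.

    Hypothetical stand-in for a lightweight, plug-and-play OOD detector.
    """
    data = [(f, 0.0) for f in id_feats] + [(f, 1.0) for f in ood_feats]
    dim = len(data[0][0])
    w = [0.0] * dim
    b = 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for feats, label in data:
            # Gradient of the binary cross-entropy loss w.r.t. the logit.
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, feats)) + b) - label
            for j, xj in enumerate(feats):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def ood_score(feats, w, b):
    """Return the predicted probability that a feature vector is OOD."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, feats)) + b)


# Toy data (assumption): two Gaussian clusters stand in for ID object
# features and for features of synthesized OOD objects.
rng = random.Random(0)
id_feats = [[rng.gauss(-2.0, 1.0) for _ in range(4)] for _ in range(100)]
ood_feats = [[rng.gauss(2.0, 1.0) for _ in range(4)] for _ in range(100)]

w, b = train_ood_head(id_feats, ood_feats)
```

At inference time, a detection whose `ood_score` exceeds a chosen threshold would be flagged as OOD. The threshold is a tuning choice that the abstract does not specify.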
