CV | AI | LG | Oct 30, 2022

On-the-fly Object Detection using StyleGAN with CLIP Guidance

arXiv:2210.16742v1 | 1 citation | h-index: 21
Originality: Incremental advance
AI Analysis

The paper addresses the need for automated object detection in satellite imagery. The contribution appears incremental: it introduces no new model, but combines two existing ones (StyleGAN and CLIP) in a new way.

The paper tackles the problem of building object detectors on satellite imagery without human annotation by leveraging StyleGAN and CLIP to identify neurons in the generator network for on-the-fly detection.

We present a fully automated framework for building object detectors on satellite imagery without requiring any human annotation or intervention. We achieve this by leveraging the combined power of modern generative models (e.g., StyleGAN) and recent advances in multi-modal learning (e.g., CLIP). While deep generative models effectively encode the key semantics pertinent to a data distribution, this information is not immediately accessible for downstream tasks, such as object detection. In this work, we exploit CLIP's ability to associate image features with text descriptions to identify neurons in the generator network, which are subsequently used to build detectors on-the-fly.
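The abstract does not spell out how neurons are selected or how detections are produced, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that per-image CLIP image-text similarity scores and per-channel generator activations have already been computed elsewhere. It ranks channels by how strongly their mean activation correlates with the CLIP score, then thresholds one channel's activation map to get a box. The function names (`rank_channels`, `box_from_activation`), the use of Pearson correlation, the fixed threshold, and the toy numbers are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def rank_channels(pooled_acts, clip_scores):
    """Rank generator channels by correlation with a text prompt's CLIP score.

    pooled_acts[i][c]: mean activation of channel c for generated image i.
    clip_scores[i]:    image-text similarity for image i (assumed precomputed).
    Returns channel indices, most positively correlated first.
    """
    n_channels = len(pooled_acts[0])
    corr = [
        pearson([row[c] for row in pooled_acts], clip_scores)
        for c in range(n_channels)
    ]
    return sorted(range(n_channels), key=lambda c: corr[c], reverse=True)


def box_from_activation(act_map, threshold):
    """Inclusive (top, left, bottom, right) box around cells above threshold, or None."""
    hits = [
        (r, c)
        for r, row in enumerate(act_map)
        for c, v in enumerate(row)
        if v > threshold
    ]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return (min(rows), min(cols), max(rows), max(cols))


# Toy data (made up): 4 generated images, 3 channels.
pooled = [[0.1, 0.9, 0.5], [0.2, 0.1, 0.5], [0.1, 0.8, 0.5], [0.3, 0.2, 0.5]]
scores = [0.30, 0.18, 0.28, 0.17]  # hypothetical CLIP similarities to one prompt
ranking = rank_channels(pooled, scores)

# Toy 4x4 activation map for the top-ranked channel on one image.
act = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.7, 0.9, 0.0],
    [0.0, 0.8, 0.6, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
box = box_from_activation(act, 0.5)
print(ranking, box)
```

In this toy setup channel 1 tracks the text score, so it ranks first, and the thresholded map yields a box around the central 2x2 region. A real pipeline would take activations from a trained StyleGAN generator and scores from a CLIP model, and would likely need connected-component analysis to separate multiple objects.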

