CV · Feb 2, 2024

Phrase Grounding-based Style Transfer for Single-Domain Generalized Object Detection

arXiv:2402.01304v2 · 9 citations · h-index: 9 · IEEE Transactions on Circuits and Systems for Video Technology (Print)
AI Analysis

The paper addresses domain shift in object detection for autonomous driving without requiring target-domain data, but the contribution is incremental because it builds on the existing GLIP model.

The paper tackles the problem of single-domain generalized object detection by proposing a phrase grounding-based style transfer approach, achieving state-of-the-art performance on five weather driving benchmarks and surpassing some domain adaptive methods.

Single-domain generalized object detection aims to enhance a model's generalizability to multiple unseen target domains using only data from a single source domain during training. This is a practical yet challenging task as it requires the model to address domain shift without incorporating target domain data into training. In this paper, we propose a novel phrase grounding-based style transfer (PGST) approach for the task. Specifically, we first define textual prompts to describe potential objects for each unseen target domain. Then, we leverage the grounded language-image pre-training (GLIP) model to learn the style of these target domains and achieve style transfer from the source to the target domain. The style-transferred source visual features are semantically rich and could be close to imaginary counterparts in the target domain. Finally, we employ these style-transferred visual features to fine-tune GLIP. By introducing imaginary counterparts, the detector could be effectively generalized to unseen target domains using only a single source domain for training. Extensive experimental results on five diverse weather driving benchmarks demonstrate our proposed approach achieves state-of-the-art performance, even surpassing some domain adaptive methods that incorporate target domain images into the training process. The source code and pre-trained models will be made available.
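
The first two steps of the abstract (writing a prompt per target domain, then restyling source features) can be illustrated with a minimal sketch. The abstract does not say how style is represented or learned, so this sketch makes two loud assumptions: style is taken to be the channel-wise mean and standard deviation of a feature map (as in AdaIN-style transfer), and the target statistics are passed in directly rather than learned with GLIP. The function names, the prompt template, and the example categories are all hypothetical and are not taken from the paper.

```python
from statistics import fmean, pstdev


def build_prompt(categories, domain_description):
    """Build one textual prompt for a target domain.

    Hypothetical template: '<category> <domain description>.' per category.
    """
    return " ".join(f"{c} {domain_description}." for c in categories)


def transfer_style(source_feature, target_stats, eps=1e-6):
    """Restyle a feature map channel by channel.

    source_feature: list of channels, each a flat list of floats.
    target_stats:   list of (mean, std) pairs, one per channel.
    ASSUMPTION: style == channel-wise mean/std; the paper may differ.
    """
    styled = []
    for channel, (t_mean, t_std) in zip(source_feature, target_stats):
        s_mean = fmean(channel)
        s_std = pstdev(channel)
        # Normalize with source statistics, then rescale with target ones.
        styled.append(
            [(v - s_mean) / (s_std + eps) * t_std + t_mean for v in channel]
        )
    return styled


# Illustrative usage with made-up categories and statistics.
prompt = build_prompt(["car", "person"], "in the fog")
styled = transfer_style([[1.0, 2.0, 3.0]], [(10.0, 2.0)])
```

In the described pipeline, the restyled features would then be used to fine-tune the detector; that step depends on GLIP's training code and is left out here.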
