CV, Jun 10, 2021

AFAN: Augmented Feature Alignment Network for Cross-Domain Object Detection

arXiv:2106.05499v1, 11.647 citations

Originality: Incremental advance

AI Analysis

The paper addresses domain shift in object detection for real-world applications and represents an incremental improvement over existing methods.

The paper tackles unsupervised domain adaptation for object detection by proposing AFAN, which integrates intermediate domain image generation and domain-adversarial training. The authors report that it significantly outperforms state-of-the-art methods on standard benchmarks.

Unsupervised domain adaptation for object detection is a challenging problem with many real-world applications. Unfortunately, it has received much less attention than supervised object detection. Models that try to address this task tend to suffer from a shortage of annotated training samples. Moreover, existing methods of feature alignments are not sufficient to learn domain-invariant representations. To address these limitations, we propose a novel augmented feature alignment network (AFAN) which integrates intermediate domain image generation and domain-adversarial training into a unified framework. An intermediate domain image generator is proposed to enhance feature alignments by domain-adversarial training with automatically generated soft domain labels. The synthetic intermediate domain images progressively bridge the domain divergence and augment the annotated source domain training data. A feature pyramid alignment is designed and the corresponding feature discriminator is used to align multi-scale convolutional features of different semantic levels. Last but not least, we introduce a region feature alignment and an instance discriminator to learn domain-invariant features for object proposals. Our approach significantly outperforms the state-of-the-art methods on standard benchmarks for both similar and dissimilar domain adaptations. Further extensive experiments verify the effectiveness of each component and demonstrate that the proposed network can learn domain-invariant representations.
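
To make the idea of intermediate domain images with soft domain labels concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not say how the generator works, so the sketch assumes a simple pixel-wise linear blend of a source and a target image, with the blend ratio `alpha` reused as the soft domain label. The function names `blend_images` and `soft_domain_bce` are hypothetical, and images are stand-in flat lists of floats.

```python
import math

def blend_images(source, target, alpha):
    """Assumed generator: pixel-wise linear blend.

    alpha = 0 returns the source image, alpha = 1 returns the target image.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [(1.0 - alpha) * s + alpha * t for s, t in zip(source, target)]

def soft_domain_bce(pred, soft_label, eps=1e-7):
    """Binary cross-entropy of a discriminator output against a soft label.

    pred is the discriminator's estimated probability that the input comes
    from the target domain; soft_label is the assumed blend ratio in [0, 1].
    """
    p = min(max(pred, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(soft_label * math.log(p) + (1.0 - soft_label) * math.log(1.0 - p))

# Toy three-pixel "images" (illustrative values only).
source = [0.0, 0.2, 0.4]
target = [1.0, 0.8, 0.6]
alpha = 0.25  # 25% of the way from source to target

intermediate = blend_images(source, target, alpha)

# The loss is lowest when the discriminator's output matches the soft label.
loss_matched = soft_domain_bce(alpha, alpha)
loss_confident = soft_domain_bce(0.99, alpha)
```

In a domain-adversarial setup, the discriminator would minimize a loss of this kind while the feature extractor is trained to increase it (commonly through a gradient reversal layer), which pushes the features toward domain invariance. Whether AFAN uses exactly this loss or blending scheme is not stated in the abstract.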
