Attention-based Class-Conditioned Alignment for Multi-Source Domain Adaptation of Object Detectors
This work improves object detection accuracy in real-world scenarios with domain shifts, though the contribution is incremental, since it builds on existing prototype-based approaches.
The paper tackles multi-source domain adaptation for object detectors by proposing an attention-based class-conditioned alignment method. The method addresses two weaknesses of prior work, class-agnostic alignment and error accumulation from noisy pseudo-labels, and achieves state-of-the-art performance and robustness to class imbalance on multiple benchmark datasets.
Domain adaptation methods for object detection (OD) strive to mitigate the impact of distribution shifts by promoting feature alignment across source and target domains. Multi-source domain adaptation (MSDA) allows leveraging multiple annotated source datasets and unlabeled target data to improve the accuracy and robustness of the detection model. Most state-of-the-art MSDA methods for OD perform feature alignment in a class-agnostic manner. This is problematic because object appearance varies across domains, so each object category carries its own modality information. A recent prototype-based approach proposed a class-wise alignment, yet it suffers from error accumulation caused by noisy pseudo-labels, which can negatively affect adaptation with imbalanced data. To overcome these limitations, we propose an attention-based class-conditioned alignment method for MSDA, designed to align instances of each object category across domains. In particular, an attention module combined with an adversarial domain classifier allows learning domain-invariant and class-specific instance representations. Experimental results on multiple benchmark MSDA datasets indicate that our method outperforms state-of-the-art methods and is robust to class imbalance, which it achieves through a conceptually simple class-conditioning strategy. Our code is available at: https://github.com/imatif17/ACIA.
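The idea of pairing an attention module with an adversarial domain classifier can be illustrated with a small toy example. The sketch below is not the authors' implementation (see the linked repository for that). It assumes a hypothetical setup in which a class embedding acts as the attention query over the feature tokens of one detected instance, and a linear domain classifier then scores the resulting class-conditioned feature. All names (`class_conditioned_instance`, `domain_loss`, `adv_weight`) and all numbers are illustrative assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def class_conditioned_instance(roi_tokens, class_embedding):
    """Scaled dot-product attention with the class embedding as the query.

    roi_tokens: list of d-dimensional vectors (assumed here to be the
    feature cells of one pooled instance region).
    Returns the attention weights and one d-dimensional pooled feature.
    """
    scale = math.sqrt(len(class_embedding))
    weights = softmax([dot(class_embedding, tok) / scale for tok in roi_tokens])
    dim = len(roi_tokens[0])
    pooled = [sum(w * tok[i] for w, tok in zip(weights, roi_tokens))
              for i in range(dim)]
    return weights, pooled


def domain_loss(feature, domain_weights, domain_label):
    """Cross-entropy of a linear domain classifier (one weight row per domain)."""
    probs = softmax([dot(row, feature) for row in domain_weights])
    return -math.log(probs[domain_label])


# Toy data (hypothetical): two classes, three tokens, three domains
# (two labeled sources plus the unlabeled target).
class_embeddings = {"car": [1.0, 0.0], "person": [0.0, 1.0]}
roi_tokens = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
domain_weights = [[0.5, -0.5], [-0.5, 0.5], [0.0, 0.0]]

weights, feature = class_conditioned_instance(roi_tokens, class_embeddings["car"])
loss_d = domain_loss(feature, domain_weights, domain_label=0)

# The domain classifier is trained to minimize loss_d, while the detector
# features are trained on the sign-flipped, scaled loss. In autograd
# frameworks this is commonly done with a gradient reversal layer
# (an assumption here, not something the abstract states).
adv_weight = 0.1
loss_for_detector = -adv_weight * loss_d
```

Because the query depends on the class, instances of different categories are pooled differently before the domain classifier sees them, which is one simple way to make the adversarial alignment class-specific rather than class-agnostic.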