CV | Jul 17, 2025

Simulate, Refocus and Ensemble: An Attention-Refocusing Scheme for Domain Generalization

arXiv:2507.12851v1 | 1 citation | h-index: 7 | Has Code | IEEE Transactions on Multimedia
Originality: Incremental advance
AI Analysis

The paper addresses domain generalization for models like CLIP and offers an incremental improvement in handling domain shifts.

The paper tackles domain generalization with CLIP, which struggles to focus on domain-invariant regions. It proposes SRE, an attention-refocusing scheme that simulates domain shifts, refocuses attention, and applies ensemble learning. The authors report results that are generally better than state-of-the-art methods on several datasets.

Domain generalization (DG) aims to learn a model from source domains and apply it to unseen target domains with out-of-distribution data. Owing to CLIP's strong ability to encode semantic concepts, it has attracted increasing interest in domain generalization. However, CLIP often struggles to focus on task-relevant regions across domains, i.e., domain-invariant regions, resulting in suboptimal performance on unseen target domains. To address this challenge, we propose an attention-refocusing scheme, called Simulate, Refocus and Ensemble (SRE), which learns to reduce the domain shift by aligning the attention maps in CLIP via attention refocusing. SRE first simulates domain shifts by performing augmentation on the source data to generate simulated target domains. SRE then learns to reduce the domain shifts by refocusing the attention in CLIP between the source and simulated target domains. Finally, SRE utilizes ensemble learning to enhance the ability to capture domain-invariant attention maps between the source data and the simulated target data. Extensive experimental results on several datasets demonstrate that SRE generally achieves better results than state-of-the-art methods. The code is available at: https://github.com/bitPrincy/SRE-DG.
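
The three stages named in the abstract can be illustrated with a toy sketch. The abstract does not specify the augmentation, the alignment loss, or the form of the ensemble, so every choice below is an assumption made for illustration: brightness/contrast jitter stands in for the domain-shift simulation, a KL divergence between normalized attention maps stands in for the refocusing objective, and an exponential moving average of weights stands in for ensemble learning. The function names (`simulate_shift`, `refocus_loss`, `ensemble_update`) are hypothetical and are not taken from the SRE-DG repository.

```python
import math
import random


def simulate_shift(image, rng, strength=0.5):
    """Assumed augmentation: random contrast/brightness jitter on a 2D image.

    Stands in for "simulating a target domain"; the paper's actual
    augmentation is not described in the abstract.
    """
    scale = 1.0 + rng.uniform(-strength, strength)
    offset = rng.uniform(-strength, strength)
    return [[scale * px + offset for px in row] for row in image]


def normalize(attn):
    """Turn positive attention scores into a distribution over patches."""
    total = sum(attn)
    return [a / total for a in attn]


def refocus_loss(attn_src, attn_sim, eps=1e-12):
    """Assumed alignment loss: KL(source || simulated) over attention maps.

    It is zero when both maps attend to the same patches and grows as
    the simulated-domain attention drifts away from the source attention.
    """
    p = normalize(attn_src)
    q = normalize(attn_sim)
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def ensemble_update(avg_weights, new_weights, momentum=0.9):
    """Assumed ensemble: exponential moving average of model weights."""
    return [momentum * a + (1.0 - momentum) * w
            for a, w in zip(avg_weights, new_weights)]


if __name__ == "__main__":
    rng = random.Random(0)
    image = [[0.1, 0.2], [0.3, 0.4]]
    shifted = simulate_shift(image, rng)
    print("simulated target image:", shifted)
    print("loss, identical attention:", refocus_loss([1, 2, 3], [1, 2, 3]))
    print("loss, shifted attention:", refocus_loss([1, 1], [3, 1]))
    print("averaged weights:", ensemble_update([0.0, 1.0], [1.0, 1.0]))
```

In a real training loop the attention maps would come from CLIP's image encoder on a source batch and its augmented copy, and the alignment term would be added to the task loss; that wiring is omitted here because the abstract does not describe it.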

Code Implementations: 1 repo