Global Intervention and Distillation for Federated Out-of-Distribution Generalization
This paper addresses unstable convergence and poor out-of-distribution generalization in federated learning for applications such as collaborative modeling. The contribution appears incremental, since it builds on prior work in data augmentation and knowledge distillation.
The paper tackles performance degradation in federated learning due to attribute skew by proposing FedGID, a method that uses global intervention and distillation to break spurious associations and improve generalization to unseen data, outperforming existing methods on three datasets.
Attribute skew in federated learning leads local models to focus on learning non-causal associations, guiding them towards inconsistent optimization directions, which inevitably results in performance degradation and unstable convergence. Existing methods typically leverage data augmentation to enhance sample diversity or employ knowledge distillation to learn invariant representations. However, the unstable quality of the generated data and the lack of domain information limit their performance on unseen samples. To address these issues, this paper presents a global intervention and distillation method, termed FedGID, which utilizes diverse attribute features for backdoor adjustment to break the spurious association between background and label. It comprises two main modules. The global intervention module adaptively decouples objects and backgrounds in images and injects background information into random samples to intervene in the sample distribution; this links backgrounds to all categories and prevents the model from treating background-label associations as causal. The global distillation module leverages a unified knowledge base to guide the representation learning of client models, preventing local models from overfitting to client-specific attributes. Experimental results on three datasets demonstrate that FedGID enhances the model's ability to focus on the main subjects in unseen data and outperforms existing methods in collaborative modeling.
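The two modules described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not specify: object masks are taken as given (the paper's adaptive decoupling step is not reproduced), background injection is approximated by compositing each object onto the background of a randomly chosen donor sample, and the "unified knowledge base" is stood in for by a generic temperature-scaled distillation loss against teacher logits. The names `intervene_backgrounds`, `softmax`, and `distillation_loss` are hypothetical.

```python
import numpy as np


def intervene_backgrounds(images, masks, rng):
    """Pair every object with the background of a random donor sample.

    images: float array, shape (N, H, W, C)
    masks:  float array, shape (N, H, W, 1); 1 marks object pixels.
            Assumed to be given; the paper's decoupling step is not modeled.
    Labels stay with the objects, so each background can co-occur with any class.
    """
    donors = rng.permutation(len(images))
    # Keep only the donor's background; its own object region is zeroed out.
    donor_background = images[donors] * (1.0 - masks[donors])
    mixed = masks * images + (1.0 - masks) * donor_background
    return mixed, donors


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    scaled = logits / temperature
    scaled = scaled - scaled.max(axis=-1, keepdims=True)
    exp = np.exp(scaled)
    return exp / exp.sum(axis=-1, keepdims=True)


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean KL(teacher || student) on temperature-softened distributions.

    A generic knowledge-distillation loss, used here as a stand-in for
    guidance from a global model; the paper's exact objective may differ.
    """
    eps = 1e-12
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = np.sum(
        p_teacher * (np.log(p_teacher + eps) - np.log(p_student + eps)),
        axis=-1,
    )
    return float(kl.mean() * temperature ** 2)


# Toy usage: 4 random "images" with a square object region in the centre.
rng = np.random.default_rng(0)
images = rng.random((4, 8, 8, 3))
masks = np.zeros((4, 8, 8, 1))
masks[:, 2:6, 2:6, :] = 1.0
mixed, donors = intervene_backgrounds(images, masks, rng)
loss = distillation_loss(rng.normal(size=(4, 10)), rng.normal(size=(4, 10)))
```

In this sketch the object pixels of each sample are unchanged while the surrounding pixels come from another sample, which is the sense in which a background stops being predictive of a single label. The distillation term is zero when student and teacher logits agree and grows as the client model drifts from the global one.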