LG · AI · CV · Mar 5

FedAFD: Multimodal Federated Learning via Adversarial Fusion and Distillation

arXiv:2603.04890v1
Originality: Incremental advance
AI Analysis

This work is relevant to researchers and practitioners in federated learning and multimodal AI who want to improve model performance and privacy in settings with diverse data modalities and model architectures.

This paper addresses challenges in Multimodal Federated Learning (MFL) where clients hold heterogeneous data modalities, focusing on personalized client performance and on discrepancies across modalities and tasks. The proposed FedAFD framework targets learning on both the client and the server, and the authors report superior performance and efficiency in both IID and non-IID settings.

Multimodal Federated Learning (MFL) enables clients with heterogeneous data modalities to collaboratively train models without sharing raw data, offering a privacy-preserving framework that leverages complementary cross-modal information. However, existing methods often overlook personalized client performance and struggle with modality/task discrepancies, as well as model heterogeneity. To address these challenges, we propose FedAFD, a unified MFL framework that enhances client and server learning. On the client side, we introduce a bi-level adversarial alignment strategy to align local and global representations within and across modalities, mitigating modality and task gaps. We further design a granularity-aware fusion module to integrate global knowledge into the personalized features adaptively. On the server side, to handle model heterogeneity, we propose a similarity-guided ensemble distillation mechanism that aggregates client representations on shared public data based on feature similarity and distills the fused knowledge into the global model. Extensive experiments conducted under both IID and non-IID settings demonstrate that FedAFD achieves superior performance and efficiency for both the client and the server.
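The server-side step described above (weighting client representations on shared public data by feature similarity, then distilling the fused result into the global model) can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption for illustration: the function names are hypothetical, similarity is taken as cosine similarity to the global model's feature, weights come from a softmax over clients, and the distillation loss is a mean squared error. The paper's actual weighting and loss may differ.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_client_features(client_feats, global_feat, temperature=1.0):
    """Fuse one public sample's client features into a single target.

    client_feats: list of K vectors, one per client.
    global_feat: the global model's vector for the same sample.
    Assumption: clients are weighted by similarity to the global feature.
    """
    sims = [cosine(f, global_feat) for f in client_feats]
    weights = softmax(sims, temperature)
    dim = len(global_feat)
    fused = [sum(w * f[d] for w, f in zip(weights, client_feats))
             for d in range(dim)]
    return fused, weights

def distillation_loss(global_feat, fused):
    """Assumed loss: mean squared error between global feature and target."""
    return sum((g - t) ** 2 for g, t in zip(global_feat, fused)) / len(fused)

# Toy example: three clients, one public sample, 2-dimensional features.
client_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
global_feat = [1.0, 0.0]
fused, weights = fuse_client_features(client_feats, global_feat)
loss = distillation_loss(global_feat, fused)
```

In this toy setup the client whose feature points in the same direction as the global feature receives the largest weight, and the orthogonal client the smallest. Because only features on public data are exchanged, a scheme of this shape does not require clients to share a common model architecture, which is the motivation the abstract gives for distillation over parameter averaging.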

