LG · CV · MM · Dec 31, 2023

Balanced Multi-modal Federated Learning via Cross-Modal Infiltration

arXiv:2401.00894v1 · 5 citations · h-index: 22
Originality: Incremental advance
AI Analysis

This work addresses a fundamental issue in privacy-preserving distributed computing for multimodal data, though it appears incremental as it builds on existing multimodal FL solutions.

The paper tackles modality imbalance in multimodal federated learning, which leads to inadequate information exploitation and heterogeneous knowledge aggregation. It proposes the FedCMI framework, which uses cross-modal knowledge transfer, and reports improved performance on popular datasets.

Federated learning (FL) underpins advancements in privacy-preserving distributed computing by collaboratively training neural networks without exposing clients' raw data. Current FL paradigms primarily focus on uni-modal data, while exploiting the knowledge from distributed multimodal data remains largely unexplored. Existing multimodal FL (MFL) solutions are mainly designed for statistical or modality heterogeneity on the input side but have yet to solve the fundamental issue of "modality imbalance" in distributed conditions, which can lead to inadequate information exploitation and heterogeneous knowledge aggregation across modalities. In this paper, we propose a novel Cross-Modal Infiltration Federated Learning (FedCMI) framework that effectively alleviates modality imbalance and knowledge heterogeneity via knowledge transfer from the global dominant modality. To avoid losing information in the weak modality when it merely imitates the behavior of the dominant modality, we design a two-projector module that integrates knowledge from the dominant modality while still promoting local feature exploitation in the weak modality. In addition, we introduce a class-wise temperature adaptation scheme to achieve fair performance across different classes. Extensive experiments on popular datasets confirm that the proposed framework fully explores the information of each modality in MFL.
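The abstract does not spell out the loss, so the following is only a rough sketch of the general idea of transferring knowledge from a dominant modality to a weak one with a temperature that depends on the class. It is not FedCMI's actual formulation. Everything here is an assumption made for illustration: the function names `log_softmax` and `distillation_loss`, the use of a KL-divergence objective, the choice to pick the temperature by the sample's ground-truth label, and the fixed temperature table (the paper's adaptation rule is not described in this summary).

```python
import math


def log_softmax(logits, temperature):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_norm for s in scaled]


def distillation_loss(teacher_logits, student_logits, label, class_temperatures):
    """KL(teacher || student) with both distributions softened by a per-class temperature.

    teacher_logits: logits from the dominant modality (hypothetical teacher).
    student_logits: logits from the weak modality (hypothetical student).
    label: ground-truth class index, used only to look up the temperature (assumption).
    class_temperatures: one positive temperature per class (hypothetical values).
    """
    temperature = class_temperatures[label]
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


# Hypothetical example: three classes, each with its own temperature.
temperatures = [1.0, 2.0, 4.0]
dominant = [2.0, 0.5, -1.0]   # confident dominant-modality prediction
weak = [0.0, 0.0, 0.0]        # uninformative weak-modality prediction
loss_sharp = distillation_loss(dominant, weak, 0, temperatures)
loss_soft = distillation_loss(dominant, weak, 2, temperatures)
```

In this toy setup a larger temperature flattens the teacher distribution, so `loss_soft` is smaller than `loss_sharp`. A class-wise scheme could use that effect to control how strongly each class is pushed toward the dominant modality's predictions, though how FedCMI adapts the temperatures would need to be taken from the paper itself.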

