FedMM: Federated Multi-Modal Learning with Modality Heterogeneity in Computational Pathology
FedMM addresses hospitals' privacy concerns in computational pathology by enabling federated multimodal learning over heterogeneous data, though the contribution is incremental because it builds on existing federated learning methods.
The paper tackles privacy risks in multimodal learning for computational pathology by proposing FedMM, a federated learning framework that trains single-modal feature extractors without sharing raw data, and reports that it outperforms two baselines in accuracy and AUC on two publicly available datasets.
The fusion of complementary multimodal information is crucial in computational pathology for accurate diagnostics. However, existing multimodal learning approaches require access to users' raw data, posing substantial privacy risks. While Federated Learning (FL) serves as a privacy-preserving alternative, it falls short in addressing the challenges posed by heterogeneous (yet possibly overlapping) modality data across hospitals. To bridge this gap, we propose a Federated Multi-Modal (FedMM) learning framework that trains multiple single-modal feature extractors in a federated manner to enhance subsequent classification performance, in contrast to existing FL approaches that aim to train a unified multimodal fusion model. Any participating hospital, even one with a small-scale dataset or limited devices, can use these extractors to perform local downstream tasks (e.g., classification) while preserving data privacy. Through comprehensive evaluations on two publicly available datasets, we demonstrate that FedMM notably outperforms two baselines in accuracy and AUC metrics.
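The idea of training one extractor per modality across hospitals that hold different (possibly overlapping) modalities can be illustrated with a minimal sketch. The abstract does not specify the aggregation rule, so the code below assumes a FedAvg-style, sample-weighted average computed separately for each modality over only those hospitals that hold it. The function name `aggregate_per_modality`, the modality names `"image"` and `"genomic"`, and the flattened-list representation of extractor weights are all hypothetical and chosen for illustration only.

```python
from typing import Dict, List

# Assumption: each single-modal extractor is represented by a flat list of floats.
Weights = List[float]


def aggregate_per_modality(
    client_updates: List[Dict[str, Weights]],
    client_sizes: List[int],
) -> Dict[str, Weights]:
    """Sample-weighted average of extractor weights, computed per modality.

    A hospital contributes to a modality's extractor only if it holds that
    modality, so hospitals with different modality sets can still cooperate.
    This FedAvg-style rule is an assumption, not the paper's stated method.
    """
    if len(client_updates) != len(client_sizes):
        raise ValueError("one sample count is required per client update")

    sums: Dict[str, Weights] = {}
    totals: Dict[str, int] = {}
    for update, n_samples in zip(client_updates, client_sizes):
        for modality, weights in update.items():
            if modality not in sums:
                sums[modality] = [0.0] * len(weights)
                totals[modality] = 0
            accumulator = sums[modality]
            for i, value in enumerate(weights):
                accumulator[i] += n_samples * value
            totals[modality] += n_samples

    return {m: [s / totals[m] for s in sums[m]] for m in sums}


# Hypothetical round: hospital A holds images only, B holds both modalities,
# and C holds genomic data only. Only extractor weights leave each hospital.
updates = [
    {"image": [1.0, 2.0]},                          # hospital A, 100 samples
    {"image": [3.0, 4.0], "genomic": [0.0, 8.0]},   # hospital B, 300 samples
    {"genomic": [4.0, 0.0]},                        # hospital C, 100 samples
]
global_extractors = aggregate_per_modality(updates, [100, 300, 100])
print(global_extractors)
```

In this sketch no raw data is exchanged, and every hospital receives both aggregated extractors regardless of which modalities it holds locally, which is what would let it run a downstream classifier on its own data.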