Scale Federated Learning for Label Set Mismatch in Medical Image Classification
This work addresses a practical issue in healthcare FL: specialists annotate only the diseases within their own area of expertise, so clients end up with different label sets. Handling this mismatch improves model training for medical diagnosis.
The paper tackles label set mismatch in federated learning for medical image classification, where clients hold different or even disjoint label sets. It proposes FedLSM, which outperformed state-of-the-art FL algorithms on a chest X-ray dataset (112,120 images) and a skin lesion dataset (10,015 dermoscopy images).
Federated learning (FL) has been introduced to the healthcare domain as a decentralized learning paradigm that allows multiple parties to train a model collaboratively without privacy leakage. However, most previous studies have assumed that every client holds an identical label set. In reality, medical specialists tend to annotate only diseases within their area of expertise or interest. This implies that label sets in each client can be different and even disjoint. In this paper, we propose the framework FedLSM to solve the problem of Label Set Mismatch. FedLSM adopts different training strategies on data with different uncertainty levels to efficiently utilize unlabeled or partially labeled data as well as class-wise adaptive aggregation in the classification layer to avoid inaccurate aggregation when clients have missing labels. We evaluated FedLSM on two public real-world medical image datasets, including chest X-ray (CXR) diagnosis with 112,120 CXR images and skin lesion diagnosis with 10,015 dermoscopy images, and showed that it significantly outperformed other state-of-the-art FL algorithms. The code can be found at https://github.com/dzp2095/FedLSM.
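The idea of class-wise aggregation in the classification layer can be illustrated with a minimal sketch. This is not the FedLSM implementation (see the linked repository for that). It assumes the classification layer is a matrix with one row per class, and that the server averages each row only over the clients whose label set contains that class, weighted by client sample count. The function name `classwise_aggregate`, its parameters, and the sample-count weighting are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Set


def classwise_aggregate(
    client_weights: Sequence[List[List[float]]],
    client_label_sets: Sequence[Set[int]],
    client_sizes: Sequence[int],
    global_weights: List[List[float]],
) -> List[List[float]]:
    """Aggregate a classification layer row by row (one row per class).

    Illustrative assumption: row c is averaged only over clients that
    annotate class c, weighted by each client's number of samples.
    """
    new_weights: List[List[float]] = []
    for c in range(len(global_weights)):
        # Clients whose label set includes class c.
        contributors = [
            k for k, labels in enumerate(client_label_sets) if c in labels
        ]
        total = sum(client_sizes[k] for k in contributors)
        if total == 0:
            # No client annotates this class: keep the previous global row.
            new_weights.append(list(global_weights[c]))
            continue
        row = [
            sum(client_sizes[k] * client_weights[k][c][j] for k in contributors)
            / total
            for j in range(len(global_weights[c]))
        ]
        new_weights.append(row)
    return new_weights


# Toy example: 2 clients, 4 classes, feature dimension 2.
clients = [
    [[1.0, 1.0], [2.0, 2.0], [9.0, 9.0], [0.0, 0.0]],  # client 0
    [[9.0, 9.0], [6.0, 6.0], [4.0, 4.0], [0.0, 0.0]],  # client 1
]
label_sets = [{0, 1}, {1, 2}]  # class 3 is annotated by nobody
sizes = [1, 3]
previous_global = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [7.0, 7.0]]

aggregated = classwise_aggregate(clients, label_sets, sizes, previous_global)
```

In this toy setup, the row for class 1 is a weighted mean of both clients, the rows for classes 0 and 2 each come from a single client, and the row for class 3 is left at its previous global value. A plain FedAvg over the whole layer would instead mix in rows that a client never received supervision for, which is the inaccurate aggregation the abstract refers to.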