LG AI DC

Nov 2, 2024

Optimizing Federated Learning by Entropy-Based Client Selection

Andreas Lutz, Gabriele Steidl, Karsten Müller, Wojciech Samek

arXiv:2411.01240v34.64 citations

h-index: 5

2025 3rd International Conference on Federated Learning Technologies and Applications (FLTA)

Originality: Incremental advance

AI Analysis

This work addresses privacy-preserving collaborative learning for domains such as healthcare and finance by improving model robustness under data heterogeneity, though it is an incremental advance over existing federated learning methods.

The paper tackles performance degradation in federated learning caused by label skew. It proposes FedEntOpt, a client selection method that maximizes the entropy of the aggregated label distribution, and reports up to 6% higher classification accuracy on benchmarks and gains of over 30% in low-participation scenarios.

Although deep learning has revolutionized domains such as natural language processing and computer vision, its dependence on centralized datasets raises serious privacy concerns. Federated learning addresses this issue by enabling multiple clients to collaboratively train a global deep learning model without compromising their data privacy. However, the performance of such a model degrades under label skew, where the label distribution differs between clients. To overcome this issue, a novel method called FedEntOpt is proposed. In each round, it selects clients to maximize the entropy of the aggregated label distribution, ensuring that the global model is exposed to data from all available classes. Extensive experiments on multiple benchmark datasets show that the proposed method outperforms several state-of-the-art algorithms by up to 6% in classification accuracy under standard settings regardless of the model size, while achieving gains of over 30% in scenarios with low participation rates and client dropout. In addition, FedEntOpt offers the flexibility to be combined with existing algorithms, enhancing their classification accuracy by more than 40%. Importantly, its performance remains unaffected even when differential privacy is applied.
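The selection idea in the abstract can be illustrated with a small sketch. The abstract does not say how the entropy maximization is carried out, so the greedy loop below is an assumption, not the authors' procedure. The names `entropy`, `select_clients`, and `label_counts` are hypothetical, and each client is assumed to report a vector of per-class label counts.

```python
import math
from typing import Dict, List, Sequence


def entropy(counts: Sequence[float]) -> float:
    """Shannon entropy (in nats) of the distribution obtained by normalizing counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def select_clients(label_counts: Dict[str, List[int]], k: int) -> List[str]:
    """Greedily pick up to k clients whose summed label counts have high entropy.

    Assumption: a greedy heuristic that, at each step, adds the client that
    maximizes the entropy of the aggregated label distribution so far.
    """
    num_classes = len(next(iter(label_counts.values())))
    aggregate = [0] * num_classes
    remaining = dict(label_counts)
    chosen: List[str] = []
    for _ in range(min(k, len(remaining))):
        # Score each candidate by the entropy of the aggregate if it were added.
        best = max(
            remaining,
            key=lambda cid: entropy(
                [a + c for a, c in zip(aggregate, remaining[cid])]
            ),
        )
        aggregate = [a + c for a, c in zip(aggregate, remaining.pop(best))]
        chosen.append(best)
    return chosen


# Hypothetical label counts for four clients over three classes.
clients = {
    "a": [10, 0, 0],
    "b": [0, 10, 0],
    "c": [9, 1, 0],
    "d": [0, 0, 10],
}
selected = select_clients(clients, k=3)
```

In this toy setting the selected clients jointly cover all three classes, which is the behavior the abstract attributes to entropy-based selection. A real deployment would also need to decide how label counts are shared with the server, which this sketch leaves out.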
