LGJan 5, 2022

Towards Understanding Quality Challenges of the Federated Learning for Neural Networks: A First Look from the Lens of Robustness

arXiv:2201.01409v21 citations
AI Analysis

This incremental study addresses quality challenges in federated learning for researchers and practitioners by empirically assessing robustness across datasets.

The paper tackles the problem of evaluating the robustness of state-of-the-art federated learning techniques against attacks and faults, finding that model poisoning attacks are more effective than data poisoning, and a simple ensemble of aggregators provides the most robust solution in 75% of cases.

Federated learning (FL) is a distributed learning paradigm that preserves users' data privacy while leveraging the entire dataset of all participants. In FL, multiple models are trained independently on the clients and aggregated centrally to update a global model in an iterative process. Although this approach is excellent at preserving privacy, FL still suffers from quality issues such as attacks or byzantine faults. Recent attempts have been made to address such quality challenges on the robust aggregation techniques for FL. However, the effectiveness of state-of-the-art (SOTA) robust FL techniques is still unclear and lacks a comprehensive study. Therefore, to better understand the current quality status and challenges of these SOTA FL techniques in the presence of attacks and faults, we perform a large-scale empirical study to investigate the SOTA FL's quality from multiple angles of attacks, simulated faults (via mutation operators), and aggregation (defense) methods. In particular, we study FL's performance on the image classification tasks and use DNNs as our model type. Furthermore, we perform our study on two generic image datasets and one real-world federated medical image dataset. We also investigate the effect of the proportion of affected clients and the dataset distribution factors on the robustness of FL. After a large-scale analysis with 496 configurations, we find that most mutators on each user have a negligible effect on the final model in the generic datasets, and only one of them is effective in the medical dataset. Furthermore, we show that model poisoning attacks are more effective than data poisoning attacks. Moreover, choosing the most robust FL aggregator depends on the attacks and datasets. Finally, we illustrate that a simple ensemble of aggregators achieves a more robust solution than any single aggregator and is the best choice in 75% of the cases.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes