LG · AI · Nov 29, 2021

The Impact of Data Distribution on Fairness and Robustness in Federated Learning

arXiv:2112.01274v1 · 1 citation
Originality: Incremental advance
AI Analysis

These findings highlight a critical issue for deploying Federated Learning in fairness- and security-sensitive applications: even small variations in local data distributions can compromise the fairness and robustness of the trained model.

The study investigated how variations in local data distributions in Federated Learning affect fairness and robustness, finding that models exhibit higher bias and become more susceptible to attacks as distributions differ, with these degradations often more severe than accuracy drops.

Federated Learning (FL) is a distributed machine learning protocol that allows a set of agents to collaboratively train a model without sharing their datasets. This makes FL particularly suitable for settings where data privacy is desired. However, it has been observed that the performance of FL is closely related to the similarity of the agents' local data distributions. In particular, as the data distributions of agents differ, the accuracy of the trained models drops. In this work, we look at how variations in local data distributions affect the fairness and robustness properties of the trained models, in addition to their accuracy. Our experimental results indicate that the trained models exhibit higher bias and become more susceptible to attacks as local data distributions differ. Importantly, the degradation in fairness and robustness can be much more severe than the degradation in accuracy. We therefore show that small variations that have little impact on accuracy can still matter if the trained model is to be deployed in a fairness- or security-critical context.
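
The abstract does not say how the local data distributions were varied. As a rough illustration only, the sketch below uses Dirichlet label skew, a common way to simulate heterogeneous agents in FL experiments, and measures how far each agent's label distribution sits from the global one. The function names (`partition_by_label`, `label_skew`) and the `alpha` parameter are hypothetical and are not taken from the paper or its repository.

```python
import random
from collections import Counter


def dirichlet(alpha, k, rng):
    """Draw one sample from a symmetric Dirichlet(alpha) over k outcomes."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    if total == 0.0:  # guard against numerical underflow for tiny alpha
        return [1.0 / k] * k
    return [d / total for d in draws]


def partition_by_label(labels, num_agents, alpha, seed=0):
    """Split sample indices among agents with label skew set by alpha.

    Small alpha: each class concentrates on a few agents (heterogeneous).
    Large alpha: each class is spread almost evenly (close to IID).
    This is an assumed simulation scheme, not the paper's stated method.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    parts = [[] for _ in range(num_agents)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        shares = dirichlet(alpha, num_agents, rng)
        start, cum = 0, 0.0
        for agent, share in enumerate(shares):
            cum += share
            # The last agent takes the remainder so every index is assigned.
            if agent == num_agents - 1:
                end = len(idxs)
            else:
                end = round(cum * len(idxs))
            parts[agent].extend(idxs[start:end])
            start = end
    return parts


def label_skew(labels, parts):
    """Mean total-variation distance between local and global label frequencies."""
    n = len(labels)
    global_freq = Counter(labels)
    dists = []
    for part in parts:
        if not part:  # an agent may receive no samples when alpha is small
            continue
        local = Counter(labels[i] for i in part)
        tv = 0.5 * sum(
            abs(local[c] / len(part) - global_freq[c] / n) for c in global_freq
        )
        dists.append(tv)
    return sum(dists) / len(dists)
```

Sweeping `alpha` from large to small values gives a family of partitions with increasing `label_skew`. A study of this kind would train a federated model on each partition and compare accuracy, a bias metric, and attack success across them.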

Code Implementations: 1 repo
