Robustness analytics to data heterogeneity in edge computing
This work addresses robustness issues in Federated Learning for edge computing applications, but it is incremental as it builds on existing frameworks with experimental validation.
The authors investigated the robustness of Federated Learning to data heterogeneity, specifically addressing biased sampling from categorical heterogeneity and active sampling at the edge, and found that it remains robust when local training iterations and communication frequency are properly adjusted.
Federated Learning is a framework that jointly trains a model \textit{with} complete knowledge on a remotely placed centralized server, but \textit{without} the requirement of accessing the data stored in distributed machines. Some work assumes that the data generated by edge devices are independently and identically sampled from a common population distribution. However, such ideal sampling may not be realistic in many contexts. Moreover, models based on intrinsic agency, such as active sampling schemes, may lead to highly biased sampling. A pressing question is therefore how robust Federated Learning is to biased sampling. In this work\footnote{\url{https://github.com/jiaqian/robustness_of_FL}}, we experimentally investigate two such scenarios. First, we study a centralized classifier aggregated from a collection of local classifiers trained on data with categorical heterogeneity. Second, we study a classifier aggregated from a collection of local classifiers trained on data obtained through active sampling at the edge. We present evidence in both scenarios that Federated Learning is robust to data heterogeneity when local training iterations and communication frequency are appropriately chosen.
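To make the first scenario concrete, the following is a minimal sketch of aggregating local classifiers under extreme categorical heterogeneity. It is an illustration under stated assumptions, not the experimental setup of this work: the model is a one-dimensional logistic regression, each client holds samples of a single class, aggregation is a size-weighted parameter average, and the names `local_train`, `fed_avg`, and `federated_train` are hypothetical. The arguments `local_steps` and `rounds` stand in for the local training iterations and communication frequency discussed above.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def local_train(params, data, local_steps, lr):
    """Run `local_steps` full-batch gradient steps of logistic regression
    on one client's data, starting from the current global parameters."""
    w, b = params
    for _ in range(local_steps):
        gw = gb = 0.0
        for x, y in data:
            err = sigmoid(w * x + b) - y  # gradient of the log loss w.r.t. the logit
            gw += err * x
            gb += err
        w -= lr * gw / len(data)
        b -= lr * gb / len(data)
    return [w, b]

def fed_avg(client_params, client_sizes):
    """Average client parameter vectors, weighted by local sample count."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(p[i] * n for p, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]

def federated_train(client_data, rounds, local_steps, lr=0.5):
    """Alternate local training and server-side averaging for `rounds` rounds."""
    params = [0.0, 0.0]  # [weight, bias]
    for _ in range(rounds):
        updates = [local_train(params, d, local_steps, lr) for d in client_data]
        params = fed_avg(updates, [len(d) for d in client_data])
    return params

# Assumed toy data with extreme label skew: each client sees one class only.
clients = [
    [(-1.0, 0)] * 4,  # client 0: class 0 only
    [(1.0, 1)] * 4,   # client 1: class 1 only
]
w, b = federated_train(clients, rounds=10, local_steps=5)
```

In this toy setting each local model sees only one class, so its bias drifts toward that class between communication rounds. The averaged bias stays near zero and the averaged weight still separates the two classes. Increasing `local_steps` while reducing `rounds` lets the local models drift further before each aggregation, which is the trade-off the experiments examine.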