Communication-Efficient Hierarchical Federated Learning for IoT Heterogeneous Systems with Imbalanced Data
This work addresses communication efficiency and data imbalance issues in federated learning for IoT telemonitoring systems, representing an incremental improvement over existing hierarchical FL schemes.
This paper tackles communication bottlenecks and imbalanced data in hierarchical federated learning for IoT systems by proposing an optimized solution for user assignment and resource allocation. Compared with hierarchical FL schemes that use distance-based user assignment, it achieves 4-6% higher classification accuracy and, for the same model accuracy, a 75-85% reduction in communication rounds between edge nodes and the centralized server.
Federated learning (FL) is a distributed learning methodology that allows multiple nodes to cooperatively train a deep learning model without sharing their local data. It is a promising solution for telemonitoring systems that demand intensive data collection from different locations, for detection, classification, and prediction of future events, while maintaining strict privacy constraints. Due to privacy concerns and critical communication bottlenecks, it can become impractical to send the updated FL models to a centralized server. Thus, this paper studies the potential of hierarchical FL in heterogeneous IoT systems and proposes an optimized solution for user assignment and resource allocation across multiple edge nodes. In particular, this work focuses on a generic class of machine learning models that are trained using gradient-descent-based schemes, while considering the practical constraint of non-uniformly distributed data across different users. We evaluate the proposed system using two real-world datasets and show that it outperforms state-of-the-art FL solutions. Specifically, our numerical results show that the approach provides a 4-6% increase in classification accuracy with respect to hierarchical FL schemes that rely on distance-based user assignment. Furthermore, the proposed approach significantly accelerates FL training and reduces communication overhead, providing a 75-85% reduction in the communication rounds between edge nodes and the centralized server for the same model accuracy.
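To make the two-level structure concrete, the following is a minimal sketch of one hierarchical aggregation step, assuming FedAvg-style averaging weighted by local sample counts. It is not the optimized user assignment or resource allocation proposed in this paper: the assignment of users to edge nodes is taken as a given input, and the names `weighted_average`, `hierarchical_aggregate`, `user_models`, `user_samples`, and `assignment` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def weighted_average(models: Sequence[Vector], weights: Sequence[float]) -> Vector:
    """Average parameter vectors, weighting each by its sample count."""
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(models[0])
    return [
        sum(w * m[i] for m, w in zip(models, weights)) / total
        for i in range(dim)
    ]


def hierarchical_aggregate(
    user_models: Dict[str, Vector],
    user_samples: Dict[str, int],
    assignment: Dict[str, List[str]],
) -> Tuple[Dict[str, Vector], Vector]:
    """One user -> edge -> server aggregation step.

    `assignment` maps each edge node to the users it serves. It is assumed
    to be given; the paper's optimized assignment is not reproduced here.
    """
    edge_models: Dict[str, Vector] = {}
    edge_samples: Dict[str, int] = {}
    for edge, users in assignment.items():
        if not users:
            continue  # an edge node with no users contributes nothing
        # Edge-level aggregation over the users assigned to this edge node.
        edge_models[edge] = weighted_average(
            [user_models[u] for u in users],
            [user_samples[u] for u in users],
        )
        edge_samples[edge] = sum(user_samples[u] for u in users)

    # Server-level aggregation over the edge models.
    edges = list(edge_models)
    global_model = weighted_average(
        [edge_models[e] for e in edges],
        [edge_samples[e] for e in edges],
    )
    return edge_models, global_model


# Toy example: three users, two edge nodes, two-parameter models.
user_models = {"u1": [1.0, 0.0], "u2": [3.0, 2.0], "u3": [5.0, 4.0]}
user_samples = {"u1": 10, "u2": 30, "u3": 60}
assignment = {"e1": ["u1", "u2"], "e2": ["u3"]}

edge_models, global_model = hierarchical_aggregate(
    user_models, user_samples, assignment
)
```

In a hierarchical scheme, the edge-level step would typically run several times between server-level steps, which is where savings in edge-to-server communication rounds can arise. With imbalanced data, which users share an edge node determines how representative each edge model is, so the assignment passed in here is the quantity an optimized scheme would choose.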