LG AI | Feb 12, 2025

FBFL: A Field-Based Coordination Approach for Data Heterogeneity in Federated Learning

Davide Domini, Gianluca Aguzzi, Lukas Esterle, Mirko Viroli

arXiv:2502.08577v2 | 13.06 citations | h-index: 9 | Has Code

Originality: Incremental advance

AI Analysis

The paper addresses scalability and performance issues in federated learning for privacy-sensitive domains with non-IID data. It represents an incremental improvement over existing methods.

The paper tackles the problem of data heterogeneity and centralization bottlenecks in federated learning by proposing FBFL, a field-based coordination approach that uses spatial-based leader election and a self-organizing hierarchical architecture. It demonstrates that FBFL performs comparably to FedAvg under IID conditions and outperforms FedAvg, FedProx, and Scaffold in non-IID scenarios, while also showing resilience to server failures.

In recent years, federated learning (FL) has become a popular solution to train machine learning models in domains with high privacy concerns. However, FL scalability and performance face significant challenges in real-world deployments where data across devices are not independent and identically distributed (non-IID). The heterogeneity in data distribution frequently arises from the spatial distribution of devices, leading to degraded model performance in the absence of proper handling. Additionally, FL's typical reliance on centralized architectures introduces bottlenecks and single-point-of-failure risks, which are particularly problematic at scale or in dynamic environments. To close this gap, we propose Field-Based Federated Learning (FBFL), a novel approach leveraging macroprogramming and field coordination to address these limitations through: (i) distributed spatial-based leader election for personalization to mitigate non-IID data challenges; and (ii) construction of a self-organizing, hierarchical architecture using advanced macroprogramming patterns. Moreover, FBFL not only overcomes the aforementioned limitations, but also enables the development of more specialized models tailored to the specific data distribution in each subregion. This paper formalizes FBFL and evaluates it extensively using MNIST, FashionMNIST, and Extended MNIST datasets. We demonstrate that, when operating under IID data conditions, FBFL performs comparably to the widely used FedAvg algorithm. Furthermore, in challenging non-IID scenarios, FBFL not only outperforms FedAvg but also surpasses other state-of-the-art methods, namely FedProx and Scaffold, which have been specifically designed to address non-IID data distributions. Additionally, we showcase the resilience of FBFL's self-organizing hierarchical architecture against server failures.
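
The idea of region-wise aggregation described in the abstract can be illustrated with a small sketch. It is not the paper's implementation: FBFL elects leaders in a distributed way through field coordination, whereas here the leaders are simply given by hand, each device joins its spatially nearest leader, and each leader applies FedAvg-style sample-weighted averaging to its own region only. All names (`fedavg`, `assign_to_leaders`, `regional_round`) and the toy data are hypothetical.

```python
from math import dist


def fedavg(updates):
    """Sample-weighted average of (weights, num_samples) pairs, FedAvg style."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]


def assign_to_leaders(positions, leaders):
    """Map each device id to the id of its spatially nearest leader.

    Assumption: leaders are fixed in advance; the paper elects them
    with a distributed protocol that is not reproduced here.
    """
    return {
        dev: min(leaders, key=lambda leader: dist(pos, positions[leader]))
        for dev, pos in positions.items()
    }


def regional_round(positions, leaders, local_models, sample_counts):
    """One aggregation round in which each leader averages only its region."""
    region_of = assign_to_leaders(positions, leaders)
    regional = {}
    for leader in leaders:
        members = [d for d, l in region_of.items() if l == leader]
        regional[leader] = fedavg(
            [(local_models[d], sample_counts[d]) for d in members]
        )
    return region_of, regional


# Toy setup: two spatial clusters, one leader in each (hypothetical data).
positions = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (10.0, 0.0), "d": (11.0, 0.0)}
leaders = ["a", "c"]
local_models = {
    "a": [1.0, 2.0],
    "b": [3.0, 4.0],
    "c": [10.0, 10.0],
    "d": [20.0, 30.0],
}
sample_counts = {"a": 10, "b": 30, "c": 5, "d": 5}

region_of, regional = regional_round(positions, leaders, local_models, sample_counts)
print(region_of)
print(regional)
```

Because each region keeps its own averaged model, devices whose data differ by location are not forced into a single global model, which is the intuition behind the personalization claim above.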
