FedBN: Federated Learning on Non-IID Features via Local Batch Normalization
FedBN addresses data heterogeneity in federated learning for applications such as medical imaging and autonomous driving. The contribution is incremental, however, because it builds on existing FL methods with one specific adaptation.
The paper tackles feature-shift non-IID data in federated learning, where the feature distribution of each client's data differs from that of other clients, and proposes FedBN, which uses local batch normalization to alleviate the shift before models are averaged. In the reported experiments the method outperforms FedAvg and FedProx, and a convergence analysis in a simplified setting shows a faster convergence rate than FedAvg.
The emerging paradigm of federated learning (FL) strives to enable collaborative training of deep models on the network edge without centrally aggregating raw data, thereby improving data privacy. In most cases, the assumption of independent and identically distributed samples across local clients does not hold for federated learning setups. Under this setting, neural network training performance may vary significantly with the data distribution, and training convergence may even be hurt. Most previous work has focused on differences in the distribution of labels or on client shifts. Unlike those settings, we address an important problem of FL in which local clients store examples whose distributions differ from those of other clients, e.g., different scanners/sensors in medical imaging or different scenery distributions in autonomous driving (highway vs. city). We denote this as feature shift non-iid. In this work, we propose an effective method that uses local batch normalization to alleviate the feature shift before averaging models. The resulting scheme, called FedBN, outperforms both classical FedAvg and the state-of-the-art for non-iid data (FedProx) in our extensive experiments. These empirical results are supported by a convergence analysis that shows, in a simplified setting, that FedBN has a faster convergence rate than FedAvg. Code is available at https://github.com/med-air/FedBN.
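The aggregation idea described above can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes that model parameters are plain dictionaries of float lists, that batch normalization entries can be recognized by a hypothetical naming rule (a key component starting with "bn"), and that all clients are weighted equally. The function names `is_bn_param` and `fedbn_round` are made up for this illustration.

```python
from typing import Dict, List

# A client's model state: parameter name -> flat list of values (assumed layout).
State = Dict[str, List[float]]


def is_bn_param(name: str) -> bool:
    """Hypothetical rule: a key component starting with 'bn' marks a BN entry."""
    return any(part.startswith("bn") for part in name.split("."))


def fedbn_round(client_states: List[State]) -> List[State]:
    """Average non-BN parameters across clients; keep BN entries local.

    Equal client weights are an assumption of this sketch.
    """
    n = len(client_states)
    shared_keys = [k for k in client_states[0] if not is_bn_param(k)]

    # Element-wise mean over clients for every shared (non-BN) parameter.
    averaged: State = {}
    for key in shared_keys:
        columns = zip(*(state[key] for state in client_states))
        averaged[key] = [sum(col) / n for col in columns]

    # Each client receives the averaged shared parameters but keeps its own
    # BN parameters and statistics untouched.
    new_states: List[State] = []
    for state in client_states:
        updated = {k: list(v) for k, v in state.items()}
        updated.update({k: list(v) for k, v in averaged.items()})
        new_states.append(updated)
    return new_states


clients = [
    {"conv1.weight": [1.0, 2.0], "bn1.weight": [0.5], "bn1.running_mean": [0.1]},
    {"conv1.weight": [3.0, 4.0], "bn1.weight": [1.5], "bn1.running_mean": [0.9]},
]
updated = fedbn_round(clients)
```

After one round, both clients in this toy example share the same `conv1.weight`, while each retains its own `bn1.*` values. Plain FedAvg would correspond to averaging every key, including the BN entries.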