LG, AI, DC · Mar 19, 2023

Experimenting with Normalization Layers in Federated Learning on non-IID scenarios

arXiv:2303.10630v1 · 30 citations · h-index: 30 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses a critical bottleneck in federated learning for privacy-preserving collaborative training, but it is incremental as it focuses on hyperparameter optimization.

The paper tackles performance challenges in federated learning on non-IID data by benchmarking normalization layers and collaboration frequency. It finds that Group and Layer Normalization outperform Batch Normalization, and that frequent aggregation reduces both convergence speed and model quality.
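As a rough illustration of why the choice of normalization layer can matter under data skew, the sketch below contrasts per-sample normalization (Layer and Group) with per-batch normalization (Batch). It is a simplified toy on flat feature vectors, not the paper's implementation: the function names are hypothetical, learnable scale and shift parameters are omitted, and real Group Normalization groups channels of a feature map rather than slices of a vector.

```python
import math
from statistics import fmean, pvariance

EPS = 1e-5  # small constant for numerical stability (assumed value)


def _normalize(values):
    """Shift and scale a list of numbers to zero mean and unit variance."""
    mu = fmean(values)
    var = pvariance(values, mu)
    return [(v - mu) / math.sqrt(var + EPS) for v in values]


def layer_norm(batch):
    """Per-sample: each row is normalized over its own features only."""
    return [_normalize(row) for row in batch]


def group_norm(batch, num_groups):
    """Per-sample: each row is split into contiguous groups, normalized separately."""
    out = []
    for row in batch:
        size = len(row) // num_groups
        normed = []
        for g in range(num_groups):
            normed.extend(_normalize(row[g * size:(g + 1) * size]))
        out.append(normed)
    return out


def batch_norm(batch):
    """Per-feature: each column is normalized over all samples in the batch."""
    columns = [_normalize(list(col)) for col in zip(*batch)]
    return [list(row) for row in zip(*columns)]


# The same sample x placed in two batches with very different companions,
# standing in for two clients whose local data are skewed differently.
x = [1.0, 2.0, 3.0, 4.0]
batch_a = [x, [1.5, 2.5, 3.5, 4.5], [2.0, 3.0, 4.0, 5.0]]
batch_b = [x, [10.0, 20.0, 30.0, 40.0], [20.0, 30.0, 40.0, 50.0]]

same_under_ln = layer_norm(batch_a)[0] == layer_norm(batch_b)[0]
same_under_gn = group_norm(batch_a, 2)[0] == group_norm(batch_b, 2)[0]
same_under_bn = batch_norm(batch_a)[0] == batch_norm(batch_b)[0]
```

In this toy, the Layer and Group outputs for `x` do not depend on which batch it sits in, while the Batch output does, because Batch Normalization statistics are computed from whatever data the client happens to hold.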

Training Deep Learning (DL) models requires large, high-quality datasets, often assembled with data from different institutions. Federated Learning (FL) has been emerging as a method for privacy-preserving pooling of datasets: institutions train collaboratively by iteratively aggregating their locally trained models into a global model. One critical performance challenge of FL is operating on datasets that are not independently and identically distributed (non-IID) among the federation participants. Even though this fragility cannot be eliminated, it can be mitigated by suitably tuning two hyper-parameters: the normalization layer and the collaboration frequency. In this work, we benchmark five different normalization layers for training Neural Networks (NNs) across two families of non-IID data skew and two datasets. Results show that Batch Normalization, widely employed for centralized DL, is not the best choice for FL, whereas Group and Layer Normalization consistently outperform it. Similarly, frequent model aggregation decreases convergence speed and model quality.
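The iterative aggregation of locally trained models described in the abstract can be sketched as follows. This is a minimal toy, assuming a FedAvg-style average weighted by client dataset size (the abstract does not name the aggregation rule) and a one-parameter least-squares model; all function and parameter names are hypothetical. The `local_steps` argument plays the role of the collaboration frequency: more local steps per round means less frequent aggregation.

```python
def local_train(w, data, steps, lr=0.01):
    """Run `steps` full-batch gradient steps on the model y = w * x."""
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def aggregate(client_weights, client_sizes):
    """Average client models, weighting each by its local dataset size."""
    total = sum(client_sizes)
    return sum(w * n for w, n in zip(client_weights, client_sizes)) / total


def federated_train(clients, rounds, local_steps, w0=0.0):
    """Alternate local training on every client with global aggregation."""
    w = w0
    for _ in range(rounds):
        local = [local_train(w, data, local_steps) for data in clients]
        w = aggregate(local, [len(data) for data in clients])
    return w


# Two clients holding different slices of data generated by y = 2 * x.
clients = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
w_final = federated_train(clients, rounds=50, local_steps=5)
```

Raw data never leaves a client in this loop; only the scalar model `w` is exchanged. The toy does not reproduce the paper's findings on aggregation frequency, it only shows where that hyper-parameter enters.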

Code Implementations: 1 repo
