LG · ML · Aug 18, 2023

Understanding the Role of Layer Normalization in Label-Skewed Federated Learning

arXiv:2308.09565v2 · 5 citations · h-index: 24 · Has Code
AI Analysis

This paper addresses the challenge of training federated learning models on non-i.i.d. data, which is common in real-world applications, though its contribution appears incremental in that it mainly explains existing techniques.

The paper investigates why layer normalization improves federated learning performance under label-skewed data, revealing that feature normalization prevents feature collapse and local overfitting, leading to drastic improvements on standard benchmarks under extreme label shift.
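To make "extreme label shift" concrete, the following is a minimal sketch of one way to split a labeled dataset so that each client sees only a few classes. The function name `partition_by_label`, its parameters, and the round-robin class assignment are illustrative assumptions, not the partitioning scheme used in the paper.

```python
import random
from collections import defaultdict


def partition_by_label(labels, num_clients, classes_per_client, seed=0):
    """Split example indices so each client holds only a few classes.

    Hypothetical helper for illustration; classes are assigned to clients
    round-robin, and classes that no client owns are dropped.
    """
    rng = random.Random(seed)

    # Group example indices by their class label.
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    classes = sorted(by_class)

    # Record which clients own each class (round-robin assignment).
    owners = defaultdict(list)
    for client in range(num_clients):
        for k in range(classes_per_client):
            y = classes[(client * classes_per_client + k) % len(classes)]
            owners[y].append(client)

    # Deal the shuffled examples of each class out to its owners.
    shards = [[] for _ in range(num_clients)]
    for y, idxs in by_class.items():
        rng.shuffle(idxs)
        holders = owners.get(y, [])
        for j, idx in enumerate(idxs):
            if holders:
                shards[holders[j % len(holders)]].append(idx)
    return shards


# Toy usage: 100 examples, 10 classes, 5 clients with 2 classes each.
labels = [i % 10 for i in range(100)]
shards = partition_by_label(labels, num_clients=5, classes_per_client=2)
for client, shard in enumerate(shards):
    print(client, sorted({labels[i] for i in shard}), len(shard))
```

With `classes_per_client=1` this gives the most extreme case, where every client trains on a single class.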

Layer normalization (LN) is a widely adopted deep learning technique, especially in the era of foundation models. Recently, LN has been shown to be surprisingly effective in federated learning (FL) with non-i.i.d. data. However, exactly why and how it works remains mysterious. In this work, we reveal the profound connection between layer normalization and the label shift problem in federated learning. To understand layer normalization better in FL, we identify the key contributing mechanism of normalization methods in FL, called feature normalization (FN), which applies normalization to the latent feature representation before the classifier head. Although LN and FN do not improve expressive power, they control feature collapse and local overfitting to heavily skewed datasets, and thus accelerate global training. Empirically, we show that normalization leads to drastic improvements on standard benchmarks under extreme label shift. Moreover, we conduct extensive ablation studies to understand the critical factors of layer normalization in FL. Our results verify that FN is an essential ingredient inside LN to significantly improve the convergence of FL while remaining robust to learning rate choices, especially under extreme label shift where each client has access to few classes. Our code is available at https://github.com/huawei-noah/Federated-Learning/tree/main/Layer_Normalization.
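The idea of normalizing the latent features before the classifier head can be sketched in a few lines. This is a minimal illustration that assumes the normalization is a layer-norm-style centering and scaling without learnable parameters; the function names `layer_normalize`, `classifier_head`, and `forward_with_fn` are hypothetical, and the exact formulation in the paper and its repository may differ.

```python
import math


def layer_normalize(features, eps=1e-5):
    """Center a feature vector and scale it to unit variance (no learnable affine)."""
    n = len(features)
    mean = sum(features) / n
    var = sum((f - mean) ** 2 for f in features) / n
    return [(f - mean) / math.sqrt(var + eps) for f in features]


def classifier_head(features, weights, biases):
    """Linear head producing one logit per class."""
    return [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]


def forward_with_fn(features, weights, biases):
    """Normalize the latent features, then apply the classifier head."""
    return classifier_head(layer_normalize(features), weights, biases)


# Toy usage: rescaling the latent features changes the raw logits a lot,
# but barely changes the logits computed from normalized features.
feats = [1.0, 2.0, 3.0, 4.0]
scaled = [100.0 * f for f in feats]
W = [[0.5, -0.5, 0.25, 0.0], [0.1, 0.2, 0.3, 0.4]]
b = [0.0, 0.1]
print(classifier_head(feats, W, b), classifier_head(scaled, W, b))
print(forward_with_fn(feats, W, b), forward_with_fn(scaled, W, b))
```

Because the head only ever sees features of fixed scale, a client cannot fit its skewed local labels simply by inflating feature magnitudes, which is one intuition consistent with the abstract's claim about controlling local overfitting.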

Code Implementations: 1 repo