LG · ML · Apr 7, 2020

FedMAX: Mitigating Activation Divergence for Accurate and Communication-Efficient Federated Learning

arXiv:2004.03657v2 · 42 citations
AI Analysis

This paper addresses data heterogeneity in Federated Learning (FL) for applications such as medical data analysis, but its contribution is incremental because it builds on existing FL methods.

The paper tackles the problem of activation divergence in Federated Learning caused by non-IID data, proposing FedMAX to improve accuracy and communication efficiency. Results show better accuracy and efficiency than state-of-the-art methods on benchmarks and medical datasets.

In this paper, we identify a new phenomenon called activation-divergence, which occurs in Federated Learning (FL) due to data heterogeneity (i.e., data being non-IID) across multiple users. Specifically, we argue that the activation vectors in FL can diverge, even if subsets of users share a few common classes with data residing on different devices. To address the activation-divergence issue, we introduce a prior based on the principle of maximum entropy; this prior assumes minimal information about the per-device activation vectors and aims at making the activation vectors of the same classes as similar as possible across multiple devices. Our results show that, for both IID and non-IID settings, our proposed approach results in better accuracy (due to the significantly more similar activation vectors across multiple devices) and is more communication-efficient than state-of-the-art approaches in FL. Finally, we illustrate the effectiveness of our approach on a few common benchmarks and two large medical datasets.
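The abstract does not spell out the form of the maximum-entropy prior, so the following is only a minimal sketch of one plausible reading: each device adds to its local loss a penalty equal to the KL divergence between the softmax-normalized activation vector and the uniform distribution. Minimizing that penalty maximizes the entropy of the normalized activations. The function names (`max_entropy_penalty`, `local_loss`), the weight `beta`, and the choice of NumPy are assumptions made for illustration and are not taken from the paper or its repository.

```python
import numpy as np


def softmax(x):
    # Numerically stable softmax over the last axis.
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def max_entropy_penalty(activations):
    """Mean KL(softmax(a) || uniform) over a batch of activation vectors.

    This equals log(d) minus the entropy of softmax(a), so it is zero when
    the normalized activations are uniform and grows as they become peaked.
    Assumed form of the regularizer, not confirmed by the source text.
    """
    p = softmax(activations)
    d = activations.shape[-1]
    eps = 1e-12  # guards against log(0)
    kl = np.sum(p * (np.log(p + eps) + np.log(d)), axis=-1)
    return float(kl.mean())


def local_loss(logits, labels, activations, beta=1.0):
    """Cross-entropy on the logits plus beta times the entropy penalty.

    `beta` is a hypothetical weighting hyperparameter.
    """
    log_probs = np.log(softmax(logits) + 1e-12)
    ce = -log_probs[np.arange(len(labels)), labels].mean()
    return float(ce + beta * max_entropy_penalty(activations))
```

In a federated round, each device would minimize something like `local_loss` on its own data before the server averages the resulting weights. Because every device is pulled toward the same uniform reference, the activation vectors should drift apart less than they would under the task loss alone, which is the intuition the abstract gives for the accuracy gain.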

Code Implementations: 1 repo
