OCAILGNov 3, 2021

Federated Expectation Maximization with heterogeneity mitigation and variance reduction

arXiv:2111.02083v29 citations
Originality Incremental advance
AI Analysis

This work addresses the problem of scalable latent variable model inference for distributed datasets, offering a novel federated approach with incremental improvements in efficiency and robustness.

The paper tackles the challenge of applying the Expectation Maximization algorithm to large datasets in federated learning by introducing FedEM, a communication-efficient method that handles device heterogeneity and partial participation, achieving finite-time complexity bounds and demonstrating results in federated missing values imputation.

The Expectation Maximization (EM) algorithm is the default algorithm for inference in latent variable models. As in any other field of machine learning, applications of latent variable models to very large datasets make the use of advanced parallel and distributed architectures mandatory. This paper introduces FedEM, which is the first extension of the EM algorithm to the federated learning context. FedEM is a new communication efficient method, which handles partial participation of local devices, and is robust to heterogeneous distributions of the datasets. To alleviate the communication bottleneck, FedEM compresses appropriately defined complete data sufficient statistics. We also develop and analyze an extension of FedEM to further incorporate a variance reduction scheme. In all cases, we derive finite-time complexity bounds for smooth non-convex problems. Numerical results are presented to support our theoretical findings, as well as an application to federated missing values imputation for biodiversity monitoring.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes