LG | Feb 10, 2025

Provably Near-Optimal Federated Ensemble Distillation with Negligible Overhead

Won-Jun Jang, Hyeon-Seo Park, Si-Hyeon Lee

arXiv:2502.06349v2 | h-index: 2 | ICML

Originality: Incremental advance

AI Analysis

This work addresses client heterogeneity in federated learning, offering an incremental improvement with practical benefits for privacy-preserving distributed AI systems.

The paper tackles client heterogeneity in federated ensemble distillation by proposing a provably near-optimal method for weighting client predictions during pseudo-label generation. The method significantly outperforms baselines on image classification tasks while adding negligible communication cost, client-side privacy leakage, and client-side computation.

Federated ensemble distillation addresses client heterogeneity by generating pseudo-labels for an unlabeled server dataset based on client predictions and training the server model using the pseudo-labeled dataset. The unlabeled server dataset can either be pre-existing or generated through a data-free approach. The effectiveness of this approach critically depends on the method of assigning weights to client predictions when creating pseudo-labels, especially in highly heterogeneous settings. Inspired by theoretical results from GANs, we propose a provably near-optimal weighting method that leverages client discriminators trained with a server-distributed generator and local datasets. Our experiments on various image classification tasks demonstrate that the proposed method significantly outperforms baselines. Furthermore, we show that the additional communication cost, client-side privacy leakage, and client-side computational overhead introduced by our method are negligible, both in scenarios with and without a pre-existing server dataset.
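The abstract does not spell out the weighting rule, so the following is only a minimal sketch of how discriminator-based weighting could work, not the authors' exact method. It relies on the standard GAN result that an optimal discriminator outputs D*(x) = p_data(x) / (p_data(x) + p_g(x)), so its odds D/(1 - D) are proportional to the client's data density at x. Normalizing these odds across clients gives per-sample weights that favor clients whose local data resembles the server sample. The function names `odds_weights` and `pseudo_label`, the odds-based rule, and the clipping constant `eps` are all assumptions made for illustration.

```python
def odds_weights(disc_outputs, eps=1e-6):
    """Turn per-client discriminator outputs for ONE sample into weights.

    disc_outputs: list of floats in (0, 1), one per client.
    Assumption (illustrative): weight is proportional to the odds D / (1 - D).
    """
    odds = []
    for d in disc_outputs:
        d = min(max(d, eps), 1.0 - eps)  # clip to avoid division by zero
        odds.append(d / (1.0 - d))
    total = sum(odds)
    return [o / total for o in odds]  # weights sum to 1


def pseudo_label(client_probs, disc_outputs):
    """Weighted average of client class-probability vectors for ONE sample.

    client_probs: list of per-client probability vectors (same length each).
    disc_outputs: list of per-client discriminator outputs for the sample.
    """
    weights = odds_weights(disc_outputs)
    num_classes = len(client_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, client_probs))
        for c in range(num_classes)
    ]


if __name__ == "__main__":
    # Two clients disagree; the first client's discriminator rates the
    # sample as more typical of its local data, so its vote counts more.
    probs = [[1.0, 0.0], [0.0, 1.0]]
    print(pseudo_label(probs, [0.8, 0.5]))
```

In this toy case the first client's prediction dominates the soft pseudo-label because its discriminator output is higher. Uniform averaging would instead give both clients equal weight regardless of how well each one's local data covers the sample.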
