LG | Mar 8, 2025

Data-Free Black-Box Federated Learning via Zeroth-Order Gradient Estimation

arXiv:2503.06028v1 | 1 citation | h-index: 23 | AAAI
Originality: Incremental advance
AI Analysis

This work addresses federated learning challenges for decentralized clients by combining the advantages of distillation-based and data-free approaches. The advance is incremental, as it builds on existing methods.

The paper tackles communication burden, privacy risks, and client heterogeneity in federated learning by proposing FedZGE, a data-free, black-box framework that uses zeroth-order gradient estimation to train a server-side generator without auxiliary data or model sharing. In experiments on image classification datasets, the authors report superior results in terms of data heterogeneity, model heterogeneity, communication efficiency, and privacy protection.

Federated learning (FL) enables decentralized clients to collaboratively train a global model under the orchestration of a central server without exposing their individual data. However, the iterative exchange of model parameters between the server and clients imposes heavy communication burdens, risks potential privacy leakage, and even precludes collaboration among heterogeneous clients. Distillation-based FL tackles these challenges by exchanging low-dimensional model outputs rather than model parameters, yet it highly relies on a task-relevant auxiliary dataset that is often not available in practice. Data-free FL attempts to overcome this limitation by training a server-side generator to directly synthesize task-specific data samples for knowledge transfer. However, the update rule of the generator requires clients to share on-device models for white-box access, which greatly compromises the advantages of distillation-based FL. This motivates us to explore a data-free and black-box FL framework via Zeroth-order Gradient Estimation (FedZGE), which estimates the gradients after flowing through on-device models in a black-box optimization manner to complete the training of the generator in terms of fidelity, transferability, diversity, and equilibrium, without involving any auxiliary data or sharing any model parameters, thus combining the advantages of both distillation-based FL and data-free FL. Experiments on large-scale image classification datasets and network architectures demonstrate the superiority of FedZGE in terms of data heterogeneity, model heterogeneity, communication efficiency, and privacy protection.
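
To make the idea of zeroth-order gradient estimation concrete, here is a minimal sketch of a generic two-point random-direction estimator. It is not FedZGE's actual update rule, which the abstract does not spell out. The names `zo_gradient`, `black_box_loss`, and `minimize_black_box`, the target vector, and all hyperparameters are hypothetical and chosen only for illustration. The point it shows is that a gradient can be approximated purely from function evaluations, which is what allows an optimizer to work with a model it can only query for outputs.

```python
import random


def zo_gradient(f, x, num_queries=64, mu=1e-3, rng=None):
    """Estimate the gradient of f at x using only evaluations of f.

    Generic two-point estimator: probe f along random Gaussian directions u
    and average the directional finite differences times u.
    """
    rng = rng or random.Random(0)
    d = len(x)
    grad = [0.0] * d
    for _ in range(num_queries):
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        x_plus = [xi + mu * ui for xi, ui in zip(x, u)]
        x_minus = [xi - mu * ui for xi, ui in zip(x, u)]
        # Central difference along u; no access to the internals of f is needed.
        slope = (f(x_plus) - f(x_minus)) / (2.0 * mu)
        for i in range(d):
            grad[i] += slope * u[i] / num_queries
    return grad


TARGET = (1.0, -2.0, 0.5)  # hypothetical optimum, for illustration only


def black_box_loss(x):
    """Toy stand-in for a loss that is observable only through its output value."""
    return sum((xi - ti) ** 2 for xi, ti in zip(x, TARGET))


def minimize_black_box(f, x0, steps=200, lr=0.05, seed=0):
    """Plain gradient descent driven by the zeroth-order estimate."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        g = zo_gradient(f, x, rng=rng)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x


if __name__ == "__main__":
    print(minimize_black_box(black_box_loss, [0.0, 0.0, 0.0]))
```

In this toy setting the optimizer only ever calls `black_box_loss`, which mirrors the black-box constraint described above: the party doing the optimization sees outputs, not parameters. Each estimate costs two function evaluations per random direction, so the number of queries trades estimation variance against query cost.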

