FedAL: Black-Box Federated Knowledge Distillation Enabled by Adversarial Learning
This work addresses data heterogeneity among distributed clients in federated learning, but the contribution is incremental because it builds on existing federated knowledge distillation methods.
The paper tackles performance degradation in federated knowledge distillation under heterogeneous client data by introducing adversarial learning to align model outputs and a regularization technique to prevent catastrophic forgetting, achieving higher accuracy than existing baselines.
Knowledge distillation (KD) can enable collaborative learning among distributed clients that have different model architectures and share neither their local data nor their model parameters with others. In federated KD, each client updates its local model using the average output or feature of all client models as the target. However, existing federated KD methods often perform poorly when clients' local models are trained on heterogeneous local datasets. In this paper, we propose Federated knowledge distillation enabled by Adversarial Learning (FedAL) to address the data heterogeneity among clients. First, to alleviate the divergence of local model outputs across clients caused by data heterogeneity, the server acts as a discriminator that guides clients' local model training toward consensus model outputs through a min-max game between the clients and the discriminator. Moreover, catastrophic forgetting may occur during clients' local training and global knowledge transfer because of clients' heterogeneous local data. To address this challenge, we design a less-forgetting regularization for both local training and global knowledge transfer to guarantee clients' ability to transfer knowledge to, and learn knowledge from, others. Experimental results show that FedAL and its variants achieve higher accuracy than other federated KD baselines.
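The basic federated KD step described above (each client distilling toward the average output of all client models) can be illustrated with a minimal sketch. It is not the FedAL method itself: it omits the discriminator and the less-forgetting regularization. It also rests on assumptions not stated in the abstract, namely that all clients evaluate their models on the same shared inputs, that outputs are averaged as softmax probabilities, and that the distillation loss is a KL divergence. The function names `consensus_target` and `kd_loss` are hypothetical.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Subtract the row maximum for numerical stability.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def consensus_target(client_logits):
    # Assumption: every client scores the same shared inputs, so the
    # arrays align as (clients, samples, classes) and can be averaged.
    probs = np.stack([softmax(logits) for logits in client_logits])
    return probs.mean(axis=0)

def kd_loss(logits, target, eps=1e-12):
    # Assumption: distillation loss is the mean KL(target || client output).
    probs = softmax(logits)
    kl = np.sum(target * (np.log(target + eps) - np.log(probs + eps)), axis=-1)
    return float(kl.mean())

# Three hypothetical clients, four shared samples, three classes.
rng = np.random.default_rng(0)
client_logits = [rng.normal(size=(4, 3)) for _ in range(3)]

target = consensus_target(client_logits)
losses = [kd_loss(logits, target) for logits in client_logits]
```

Each entry of `losses` measures how far one client's output is from the averaged target. When clients train on heterogeneous data their outputs diverge, so this gap grows, which is the problem the adversarial consensus step in FedAL is meant to reduce.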