LG, CV · Sep 29, 2022

Meta Knowledge Condensation for Federated Learning

arXiv:2209.14851v1 · 20.239 citations · h-index: 53

Originality: Highly original

AI Analysis

This paper addresses communication inefficiency in federated learning, especially under heterogeneous data distributions, and proposes a new method for this known bottleneck.

The paper tackles the high communication cost and data heterogeneity in federated learning by proposing a meta knowledge condensation method with dynamic weight assignment, achieving a significant accuracy improvement from 74.07% to 92.95% on MNIST with only 10 communication rounds.

Existing federated learning paradigms usually exchange distributed models extensively at a central solver to achieve a more powerful model. However, this incurs a severe communication burden between a server and multiple clients, especially when data distributions are heterogeneous. As a result, current federated learning methods often require a large number of communication rounds in training. Unlike existing paradigms, we introduce an alternative perspective to significantly decrease the communication cost in federated learning. In this work, we first introduce a meta knowledge representation method that extracts meta knowledge from distributed clients. The extracted meta knowledge encodes essential information that can be used to improve the current model. As the training progresses, the contributions of training samples to a federated model also vary. Thus, we introduce a dynamic weight assignment mechanism that enables samples to contribute adaptively to the current model update. Then, informative meta knowledge from all active clients is sent to the server for model update. Training a model on the combined meta knowledge without exposing original data among different clients can significantly mitigate the heterogeneity issues. Moreover, to further ameliorate data heterogeneity, we also exchange meta knowledge among clients as conditional initialization for local meta knowledge extraction. Extensive experiments demonstrate the effectiveness and efficiency of our proposed method. Remarkably, our method outperforms the state-of-the-art by a large margin (from $74.07\%$ to $92.95\%$) on MNIST with a restricted communication budget (i.e., 10 rounds).
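The round structure described in the abstract (clients condense local data into a compact summary, samples are weighted dynamically, and the server updates the model from the combined summaries) can be illustrated with a toy sketch. This is not the paper's algorithm: the abstract does not specify the condensation objective or the weighting rule, so the sketch substitutes weighted class means for the condensed meta knowledge, a softmax over per-sample losses for the dynamic weights, and a nearest-prototype classifier for the federated model. All function names (`sample_weights`, `condense`, `server_update`, `run_demo`) and the synthetic 2-D data are hypothetical. The client-to-client exchange used as conditional initialization is not covered.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sample_weights(losses, temperature=1.0):
    """Assumed weighting rule: softmax over losses, so harder samples weigh more."""
    top = max(losses)  # subtract the max for numerical stability
    exps = [math.exp((loss - top) / temperature) for loss in losses]
    total = sum(exps)
    return [e / total for e in exps]


def condense(points, labels, model):
    """Toy 'meta knowledge': one weighted mean per class held by this client."""
    summary = {}
    for cls in sorted(set(labels)):
        members = [p for p, y in zip(points, labels) if y == cls]
        # Loss = squared distance to the current global prototype (0 if none yet).
        losses = [sq_dist(p, model[cls]) if cls in model else 0.0 for p in members]
        weights = sample_weights(losses)
        dim = len(members[0])
        summary[cls] = tuple(
            sum(w * p[d] for w, p in zip(weights, members)) for d in range(dim)
        )
    return summary


def server_update(summaries):
    """Server step: average the per-class summaries sent by all active clients."""
    collected = {}
    for summary in summaries:
        for cls, proto in summary.items():
            collected.setdefault(cls, []).append(proto)
    return {
        cls: tuple(sum(vals) / len(protos) for vals in zip(*protos))
        for cls, protos in collected.items()
    }


def predict(model, point):
    """Nearest-prototype prediction with the current global model."""
    return min(model, key=lambda cls: sq_dist(point, model[cls]))


def make_client(rng, classes, n_per_class=20):
    """Synthetic client data: one Gaussian blob per class (illustrative only)."""
    centers = {0: (0.0, 0.0), 1: (5.0, 5.0)}
    points, labels = [], []
    for cls in classes:
        cx, cy = centers[cls]
        for _ in range(n_per_class):
            points.append((rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)))
            labels.append(cls)
    return points, labels


def run_demo(rounds=3, seed=0):
    """Run a few communication rounds over three heterogeneous clients."""
    rng = random.Random(seed)
    # Heterogeneous split: two clients see a single class, one sees both.
    clients = [make_client(rng, [0]), make_client(rng, [1]), make_client(rng, [0, 1])]
    model = {}
    for _ in range(rounds):
        summaries = [condense(pts, lbls, model) for pts, lbls in clients]
        model = server_update(summaries)  # only summaries leave the clients
    return model
```

In this sketch each client uploads one point per class per round rather than model parameters or raw data, which is the communication pattern the abstract describes; calling `run_demo()` returns the per-class prototypes that `predict` then uses.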

