LG DC Dec 5, 2021

Joint Superposition Coding and Training for Federated Learning over Multi-Width Neural Networks

Hankyul Baek, Won Joon Yun, Yunseok Kwak, Soyi Jung, Mingyue Ji, Mehdi Bennis, Jihong Park, Joongheon Kim

arXiv:2112.02543v1 11.331 citations

Originality: Incremental advance

AI Analysis

This work addresses communication and energy efficiency in federated learning (FL) for mobile devices with varying energy capacities. It is an incremental advance because it builds on existing slimmable neural network (SNN) and FL methods.

The paper tackles the challenge of integrating federated learning with slimmable neural networks under non-IID data and poor wireless conditions. It proposes SlimFL, which uses superposition coding and superposition training to achieve communication efficiency and robustness; simulations corroborate its effectiveness.

This paper aims to integrate two synergetic technologies: federated learning (FL) and width-adjustable slimmable neural network (SNN) architectures. FL preserves data privacy by exchanging the locally trained models of mobile devices. By adopting SNNs as local models, FL can flexibly cope with the time-varying energy capacities of mobile devices. Combining FL and SNNs is, however, non-trivial, particularly under wireless connections with time-varying channel conditions. Furthermore, existing multi-width SNN training algorithms are sensitive to the data distributions across devices and are therefore ill-suited to FL. Motivated by this, we propose a communication- and energy-efficient SNN-based FL scheme (named SlimFL) that jointly utilizes superposition coding (SC) for global model aggregation and superposition training (ST) for updating local models. By applying SC, SlimFL exchanges a superposition of multiple width configurations, of which as many as possible are decoded for a given communication throughput. Leveraging ST, SlimFL aligns the forward propagation of different width configurations while avoiding inter-width interference during backpropagation. We formally prove the convergence of SlimFL. The result reveals that SlimFL is not only communication-efficient but can also counteract non-IID data distributions and poor channel conditions, which is also corroborated by simulations.
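
The idea that a receiver decodes "as many as possible" superposed width configurations for a given throughput can be illustrated with a textbook model of superposition coding with successive decoding. The sketch below is not taken from the paper: the function name `decoded_layers`, the power split, and the target rates are hypothetical, and it assumes each layer is decoded while treating the not-yet-decoded layers as interference, then subtracted before the next layer is attempted.

```python
import math


def decoded_layers(snr, power_fractions, target_rates):
    """Count how many superposed layers a receiver can decode in order.

    Assumptions (illustrative only, not the paper's exact scheme):
      snr             -- linear received SNR of the full superposed signal
      power_fractions -- share of transmit power per layer, in decoding order
      target_rates    -- spectral efficiency (bits/s/Hz) each layer is encoded at
    """
    decoded = 0
    for i, (frac, rate) in enumerate(zip(power_fractions, target_rates)):
        # Layers that are not decoded yet act as interference.
        interference = sum(power_fractions[i + 1:]) * snr
        sinr = frac * snr / (interference + 1.0)
        # Decoding fails if the achievable rate is below the encoded rate.
        if math.log2(1.0 + sinr) < rate:
            break
        decoded += 1  # layer decoded and cancelled; try the next one
    return decoded


if __name__ == "__main__":
    # Hypothetical two-layer split: most power on the first layer.
    for snr in (1.0, 3.0, 10.0):
        print(snr, decoded_layers(snr, [0.8, 0.2], [1.0, 1.0]))
```

Under these assumptions, a poor channel yields no decoded layer, a moderate channel yields only the first, and a good channel yields both. One plausible mapping to SlimFL (again an assumption) is that the first layer carries a narrow-width model and the later layers carry the additional parameters needed for wider configurations.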
