LG DC · Apr 14, 2023

TimelyFL: Heterogeneity-aware Asynchronous Federated Learning with Adaptive Partial Training

Tuo Zhang, Lei Gao, Sunwoo Lee, Mi Zhang, Salman Avestimehr

arXiv:2304.06947v122.055 citations · h-index: 33

Originality: Incremental advance

AI Analysis

This work addresses scalability and efficiency issues in cross-device federated learning for applications such as mobile and IoT systems, though it is an incremental improvement over existing methods.

The paper tackles stragglers and variable client availability in asynchronous federated learning by proposing TimelyFL, which adapts each client's local training workload to its capabilities. Compared with FedBuff, it reports a 21.13% higher participation rate, 1.28x-2.89x faster convergence, and a 6.25% improvement in test accuracy.

In cross-device Federated Learning (FL) environments, scaling synchronous FL methods is challenging as stragglers hinder the training process. Moreover, the availability of each client to join the training is highly variable over time due to system heterogeneities and intermittent connectivity. Recent asynchronous FL methods (e.g., FedBuff) have been proposed to overcome these issues by allowing slower users to continue their work on local training based on stale models and to contribute to aggregation when ready. However, we show empirically that this method can lead to a substantial drop in training accuracy as well as a slower convergence rate. The primary reason is that fast-speed devices contribute to many more rounds of aggregation while others join more intermittently or not at all, and with stale model updates. To overcome this barrier, we propose TimelyFL, a heterogeneity-aware asynchronous FL framework with adaptive partial training. During the training, TimelyFL adjusts the local training workload based on the real-time resource capabilities of each client, aiming to allow more available clients to join in the global update without staleness. We demonstrate the performance benefits of TimelyFL by conducting extensive experiments on various datasets (e.g., CIFAR-10, Google Speech, and Reddit) and models (e.g., ResNet20, VGG11, and ALBERT). In comparison with the state-of-the-art (i.e., FedBuff), our evaluations reveal that TimelyFL improves participation rate by 21.13%, harvests 1.28x - 2.89x more efficiency on convergence rate, and provides a 6.25% increment on test accuracy.
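The idea of adjusting each client's workload to its current capability can be illustrated with a small sketch. This is not the paper's algorithm: the names `ClientPlan` and `plan_workload`, the parameters `max_epochs` and `min_fraction`, and the assumption that training time scales linearly with both the epoch count and the trained fraction of the model are all hypothetical, chosen only to make the abstract's description concrete.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClientPlan:
    epochs: int            # local epochs to run this round
    train_fraction: float  # share of the model to train (1.0 = full model)


def plan_workload(full_epoch_seconds: float,
                  deadline_seconds: float,
                  max_epochs: int = 5,
                  min_fraction: float = 0.1) -> Optional[ClientPlan]:
    """Pick a workload the client can finish before the round deadline.

    Hypothetical cost model: time = epochs * train_fraction * full_epoch_seconds.
    Returns None when even the smallest allowed workload would miss the deadline.
    """
    if full_epoch_seconds <= 0 or deadline_seconds <= 0:
        raise ValueError("times must be positive")

    # Fast client: train the full model for as many epochs as fit, up to the cap.
    if full_epoch_seconds <= deadline_seconds:
        epochs = min(max_epochs, int(deadline_seconds // full_epoch_seconds))
        return ClientPlan(epochs=epochs, train_fraction=1.0)

    # Slow client: run one epoch on only the portion of the model that fits.
    fraction = deadline_seconds / full_epoch_seconds
    if fraction < min_fraction:
        return None  # too slow to contribute a timely update this round
    return ClientPlan(epochs=1, train_fraction=fraction)
```

Under these assumptions, a client that needs 10 seconds per full epoch with a 35-second deadline would be assigned three full-model epochs, while a client that needs 40 seconds per epoch with a 10-second deadline would train a quarter of the model for one epoch. Both would report before the deadline, so neither update is stale.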
