LG · Aug 27, 2022

Lottery Aware Sparsity Hunting: Enabling Federated Learning on Resource-Limited Edge

Sara Babakniya, Souvik Kundu, Saurav Prakash, Yue Niu, Salman Avestimehr

arXiv:2208.13092v311.819 citations · h-index: 33 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of enabling efficient federated learning for edge devices with varying resource constraints, representing an incremental improvement over existing sparse learning approaches.

The paper tackles the deployment of federated learning on resource-limited edge devices by proposing FLASH, a unified sparse learning framework that trains a sparse sub-model to maintain performance under ultra-low parameter density while reducing communication costs. It reports up to ~10.1% higher accuracy with ~10.26x less communication than existing methods.

Edge devices can benefit remarkably from federated learning due to their distributed nature; however, their limited resources and computing power pose limitations in deployment. A possible solution to this problem is to utilize off-the-shelf sparse learning algorithms at the clients to meet their resource budget. However, such naive deployment in the clients causes significant accuracy degradation, especially for highly resource-constrained clients. In particular, our investigations reveal that the lack of consensus in the sparsity masks among the clients may slow down the convergence of the global model and cause a substantial accuracy drop. With these observations, we present \textit{federated lottery aware sparsity hunting} (FLASH), a unified sparse learning framework for training a sparse sub-model that maintains performance under ultra-low parameter density while yielding proportional communication benefits. Moreover, given that different clients may have different resource budgets, we present \textit{hetero-FLASH}, where clients can take different density budgets based on their device resource limitations instead of supporting only one target parameter density. Experimental analysis on diverse models and datasets shows the superiority of FLASH in closing the gap with an unpruned baseline while yielding up to $\mathord{\sim}10.1\%$ improved accuracy with $\mathord{\sim}10.26\times$ less communication, compared to existing alternatives, at similar hyperparameter settings. Code is available at \url{https://github.com/SaraBabakN/flash_fl}.
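
The mask-consensus issue mentioned in the abstract can be illustrated with a small toy sketch. This is not the FLASH algorithm and is not taken from the authors' code. It assumes plain magnitude-based top-k pruning, models local training as Gaussian noise added to a shared weight vector, and uses hypothetical helper names (`topk_mask`, `jaccard`, `masked_average`) chosen only for this illustration.

```python
import random

def topk_mask(weights, density):
    """Binary mask keeping the largest-magnitude `density` fraction of weights."""
    k = max(1, round(density * len(weights)))
    keep = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)[:k]
    mask = [0] * len(weights)
    for i in keep:
        mask[i] = 1
    return mask

def jaccard(mask_a, mask_b):
    """Overlap of two binary masks: |A and B| / |A or B|."""
    inter = sum(a & b for a, b in zip(mask_a, mask_b))
    union = sum(a | b for a, b in zip(mask_a, mask_b))
    return inter / union if union else 1.0

def masked_average(client_weights, client_masks):
    """Average each coordinate over the clients that kept it (0.0 if none did)."""
    out = []
    for j in range(len(client_weights[0])):
        kept = [w[j] for w, m in zip(client_weights, client_masks) if m[j]]
        out.append(sum(kept) / len(kept) if kept else 0.0)
    return out

rng = random.Random(0)
n_params, density, n_clients = 1000, 0.05, 4

# Assumption: a random global model, with local training modeled as added noise.
global_w = [rng.gauss(0, 1) for _ in range(n_params)]
clients = [[w + rng.gauss(0, 1) for w in global_w] for _ in range(n_clients)]

# Case 1: every client prunes on its own, so masks can disagree.
independent_masks = [topk_mask(c, density) for c in clients]
pairs = [(i, j) for i in range(n_clients) for j in range(i + 1, n_clients)]
mean_overlap = sum(jaccard(independent_masks[i], independent_masks[j])
                   for i, j in pairs) / len(pairs)

# The aggregated model is nonzero wherever any client kept a weight,
# so its density can exceed the per-client budget.
union_mask = [max(col) for col in zip(*independent_masks)]
union_density = sum(union_mask) / n_params

# Case 2: all clients use one agreed mask, so the aggregate stays on budget.
shared_mask = topk_mask(global_w, density)
shared_density = sum(shared_mask) / n_params
aggregated = masked_average(clients, [shared_mask] * n_clients)

print(f"mean pairwise mask overlap (independent): {mean_overlap:.2f}")
print(f"aggregate density, independent masks:     {union_density:.3f}")
print(f"aggregate density, shared mask:           {shared_density:.3f}")
```

In this toy setting, independently chosen masks overlap only partially, so the averaged model is denser than any single client's budget and many coordinates are averaged over fewer clients. With an agreed mask, each client only needs to exchange the kept weights, so the payload is roughly the density times the parameter count, ignoring any cost of communicating the mask itself.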
