Protea: Client Profiling within Federated Systems using Flower
This work addresses the problem of enabling efficient and scalable simulations for federated learning researchers, though it is incremental as it builds on existing frameworks.
The paper tackles the challenge of simulating large-scale federated learning systems with heterogeneous clients by designing Protea, a client profiling component using Flower, which resulted in 1.66 times faster wall-clock time and 2.6 times better GPU utilization.
Federated Learning (FL) has emerged as a promising solution that facilitates the training of a high-performing centralised model without compromising the privacy of users. While successful, research is currently limited by the difficulty of establishing a realistic large-scale FL system at the early stages of experimentation. Simulation can help accelerate this process. To facilitate efficient and scalable FL simulation of heterogeneous clients, we design and implement Protea, a flexible and lightweight client profiling component within federated systems using the FL framework Flower. It automatically collects system-level statistics and estimates the resources needed for each client, so that the simulation runs in a resource-aware fashion. The results show that our design successfully increases parallelism, yielding $1.66\times$ faster wall-clock time and $2.6\times$ better GPU utilisation, which enables large-scale experiments on heterogeneous clients.
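To make the idea of resource-aware simulation concrete, the following is a minimal sketch of how per-client resource estimates could drive parallel placement on a single GPU. It is not Protea's implementation and does not use the Flower API: the `ClientProfile` class, the `peak_gpu_mem_mb` field, and the `pack_clients` function are hypothetical names, and first-fit decreasing packing is an assumed heuristic chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClientProfile:
    """Hypothetical profiling record for one simulated client."""
    client_id: str
    peak_gpu_mem_mb: float  # assumed statistic: peak GPU memory seen in profiling


def pack_clients(profiles: List[ClientProfile], gpu_capacity_mb: float) -> List[List[str]]:
    """Group clients into batches that can run concurrently on one GPU.

    Uses first-fit decreasing: visit clients from largest to smallest
    memory estimate and place each in the first batch with enough room.
    """
    batches: List[List[str]] = []
    free_mb: List[float] = []  # remaining memory per batch

    ordered = sorted(profiles, key=lambda p: p.peak_gpu_mem_mb, reverse=True)
    for profile in ordered:
        if profile.peak_gpu_mem_mb > gpu_capacity_mb:
            raise ValueError(f"client {profile.client_id} does not fit on the GPU")
        for i, free in enumerate(free_mb):
            if profile.peak_gpu_mem_mb <= free:
                batches[i].append(profile.client_id)
                free_mb[i] -= profile.peak_gpu_mem_mb
                break
        else:
            # No existing batch has room, so open a new one.
            batches.append([profile.client_id])
            free_mb.append(gpu_capacity_mb - profile.peak_gpu_mem_mb)
    return batches


if __name__ == "__main__":
    profiles = [
        ClientProfile("a", 6000),
        ClientProfile("b", 3000),
        ClientProfile("c", 3000),
        ClientProfile("d", 2000),
    ]
    print(pack_clients(profiles, gpu_capacity_mb=8000))
```

Under these assumptions, heterogeneous clients with small memory footprints share a GPU with larger ones instead of each occupying it alone, which is one way higher parallelism and GPU utilisation could arise from profiling data.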