CatFedAvg: Optimising Communication-efficiency and Classification Accuracy in Federated Learning
This work addresses communication and data imbalance issues in federated learning for applications like vision classification, but it is incremental as it builds on the FedAvg framework.
The authors tackled the dual challenges of communication efficiency and data heterogeneity in federated learning by introducing CatFedAvg, a family of algorithms that improved accuracy by 10% absolute points on MNIST while reducing network transfer by 70% absolute points compared to FedAvg.
Federated learning has allowed the training of statistical models over remote devices without the transfer of raw client data. In practice, training in heterogeneous and large networks introduce novel challenges in various aspects like network load, quality of client data, security and privacy. Recent works in FL have worked on improving communication efficiency and addressing uneven client data distribution independently, but none have provided a unified solution for both challenges. We introduce a new family of Federated Learning algorithms called CatFedAvg which not only improves the communication efficiency but improves the quality of learning using a category coverage maximization strategy. We use the FedAvg framework and introduce a simple and efficient step every epoch to collect meta-data about the client's training data structure which the central server uses to request a subset of weight updates. We explore two distinct variations which allow us to further explore the tradeoffs between communication efficiency and model accuracy. Our experiments based on a vision classification task have shown that an increase of 10% absolute points in accuracy using the MNIST dataset with 70% absolute points lower network transfer over FedAvg. We also run similar experiments with Fashion MNIST, KMNIST-10, KMNIST-49 and EMNIST-47. Further, under extreme data imbalance experiments for both globally and individual clients, we see the model performing better than FedAvg. The ablation study further explores its behaviour under varying data and client parameter conditions showcasing the robustness of the proposed approach.