DCFL: Non-IID awareness Data Condensation aided Federated Learning
This work addresses the accuracy loss that non-IID client data causes in federated learning, offering an incremental improvement by integrating dataset condensation with privacy preservation.
The paper tackles the challenge of non-IID data in federated learning by proposing DCFL, which groups clients using Centered Kernel Alignment and applies dataset condensation with non-IID awareness to complement data within groups, achieving competitive performance on benchmarks like MNIST, FashionMNIST, SVHN, and CIFAR-10.
Federated learning is a decentralized learning paradigm in which a central server iteratively trains a global model using clients that each hold a private dataset. The challenge is that the clients' private data may not be independent and identically distributed (Non-IID), which significantly degrades the accuracy of the global model. Existing methods commonly address the Non-IID challenge through optimization, client selection, and data complement. However, most approaches overlook the private data itself because of privacy constraints. Intuitively, statistical distinctions among the clients' private data can help reduce the degree of Non-IID. In addition, recent advances in dataset condensation have inspired us to investigate whether it can address Non-IID issues while maintaining privacy. Motivated by this, we propose DCFL, which divides clients into groups using Centered Kernel Alignment (CKA) and then uses dataset condensation methods with non-IID awareness to complement clients' data. The private data of clients within the same group is complementary, and their condensed data is accessible to all clients in the group. Additionally, a CKA-guided client selection strategy, filtering mechanisms, and data enhancement techniques are incorporated to use the condensed data efficiently and precisely, enhance model performance, and minimize communication time. Experimental results demonstrate that DCFL achieves competitive performance on popular federated learning benchmarks, including MNIST, FashionMNIST, SVHN, and CIFAR-10, with existing FL protocols.
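To make the CKA step concrete, the following is a minimal sketch of linear CKA between two feature matrices, using the standard definition on column-centered features. It is an illustration under stated assumptions, not DCFL's implementation: the abstract does not say which representations are compared or how similarity scores are turned into groups, and the names `linear_cka`, `feats_a`, `feats_b`, and `feats_c` are hypothetical.

```python
import numpy as np


def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear CKA between two feature matrices of shape (n_examples, n_features).

    Hypothetical helper for illustration; the row counts must match,
    the feature dimensions may differ.
    """
    if x.shape[0] != y.shape[0]:
        raise ValueError("x and y must have the same number of rows")
    # Center every feature column so mean offsets do not affect the score.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # Squared Frobenius norm of the cross-covariance, normalized by the
    # Frobenius norms of the two self-covariances.
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


# Synthetic stand-ins for per-client features (assumed, not from the paper).
rng = np.random.default_rng(0)
feats_a = rng.normal(size=(64, 16))
rotation, _ = np.linalg.qr(rng.normal(size=(16, 16)))
feats_b = 3.0 * feats_a @ rotation   # same features, rotated and rescaled
feats_c = rng.normal(size=(64, 16))  # independent features

same = linear_cka(feats_a, feats_b)
different = linear_cka(feats_a, feats_c)
```

Because linear CKA is invariant to orthogonal transformations and isotropic scaling, `same` equals 1 up to floating-point error, while `different` is lower. A grouping scheme could compare such pairwise scores across clients, but how DCFL does so is not specified in the abstract.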