A Novel Pearson Correlation-Based Merging Algorithm for Robust Distributed Machine Learning with Heterogeneous Data
This work addresses efficiency and resilience issues in federated learning systems, representing an incremental improvement over existing methods.
The paper tackled the problem of federated learning with heterogeneous data and adverse network conditions by proposing a Pearson correlation-based merging algorithm, achieving accuracies of 0.82, 0.73, and 0.66 under normal, packet loss, and data poisoning scenarios, outperforming SCAFFOLD.
Federated learning faces significant challenges in scenarios with heterogeneous data distributions and adverse network conditions, such as delays, packet loss, and data poisoning attacks. This paper proposes a novel method based on the SCAFFOLD algorithm to improve the quality of local updates and enhance the robustness of the global model. The key idea is to form intermediary nodes by merging local models with high similarity, using the Pearson correlation coefficient as a similarity measure. The proposed merging algorithm reduces the number of local nodes while maintaining the accuracy of the global model, effectively addressing communication overhead and bandwidth consumption. Experimental results on the MNIST dataset under simulated federated learning scenarios demonstrate the method's effectiveness. After 10 rounds of training using a CNN model, the proposed approach achieved accuracies of 0.82, 0.73, and 0.66 under normal conditions, packet loss and data poisoning attacks, respectively, outperforming the baseline SCAFFOLD algorithm. These results highlight the potential of the proposed method to improve efficiency and resilience in federated learning systems.