LG · Nov 12, 2024

Dual-Criterion Model Aggregation in Federated Learning: Balancing Data Quantity and Quality

Haizhou Zhang, Xianjia Yu, Tomi Westerlund

arXiv:2411.07816v14.62 citations · h-index: 12 · Has Code

Originality: Incremental advance

AI Analysis

The paper addresses the challenge of data heterogeneity in federated learning for privacy-preserving collaborative systems and represents an incremental improvement over existing aggregation methods.

This paper tackles the problem of suboptimal global models in federated learning, which arises when aggregation ignores data heterogeneity. It proposes a dual-criterion weighted aggregation algorithm that balances data quantity and quality, and reports better performance than state-of-the-art methods on CIFAR-10 and a visual obstacle avoidance dataset.

Federated learning (FL) has become one of the key methods for privacy-preserving collaborative learning, as it enables the transfer of models without requiring local data exchange. Within the FL framework, an aggregation algorithm is recognized as one of the most crucial components for ensuring the efficacy and security of the system. Existing average aggregation algorithms typically assume that all client-trained data holds equal value or that weights are based solely on the quantity of data contributed by each client. In contrast, alternative approaches involve training the model locally after aggregation to enhance adaptability. However, these approaches fundamentally ignore the inherent heterogeneity between different clients' data and the complexity of variations in data at the aggregation stage, which may lead to a suboptimal global model. To address these issues, this study proposes a novel dual-criterion weighted aggregation algorithm involving the quantity and quality of data from the client node. Specifically, we quantify the data used for training and perform multiple rounds of local model inference accuracy evaluation on a specialized dataset to assess the data quality of each client. These two factors are utilized as weights within the aggregation process, applied through a dynamically weighted summation of these two factors. This approach allows the algorithm to adaptively adjust the weights, ensuring that every client can contribute to the global model, regardless of their data's size or initial quality. Our experiments show that the proposed algorithm outperforms several existing state-of-the-art aggregation approaches on both a general-purpose open-source dataset, CIFAR-10, and a dataset specific to visual obstacle avoidance.
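The dual-criterion idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact weighting rule, so the fixed mixing coefficient `alpha`, the function names `dual_criterion_weights` and `aggregate`, and the use of flat parameter lists are all assumptions made for illustration. Each client's weight blends its normalized sample count (quantity) with its normalized mean accuracy over several evaluation rounds on a shared evaluation set (quality).

```python
from statistics import mean


def dual_criterion_weights(sample_counts, accuracy_rounds, alpha=0.5):
    """Blend normalized data quantity and quality into per-client weights.

    sample_counts:   number of training samples per client.
    accuracy_rounds: per client, a list of accuracies from repeated
                     evaluation rounds on a shared evaluation set.
    alpha:           hypothetical mixing coefficient (an assumption, not
                     taken from the paper); 1.0 means quantity only.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(sample_counts) != len(accuracy_rounds):
        raise ValueError("one accuracy list is required per client")

    # Quality score: average accuracy over the evaluation rounds.
    quality = [mean(rounds) for rounds in accuracy_rounds]
    total_samples = sum(sample_counts)
    total_quality = sum(quality)
    if total_samples <= 0 or total_quality <= 0:
        raise ValueError("totals must be positive to normalize")

    # Each criterion is normalized to sum to 1, so the blend also sums to 1.
    return [
        alpha * n / total_samples + (1.0 - alpha) * q / total_quality
        for n, q in zip(sample_counts, quality)
    ]


def aggregate(client_params, weights):
    """Weighted sum of flat parameter vectors (lists of floats)."""
    length = len(client_params[0])
    return [
        sum(w * params[i] for w, params in zip(weights, client_params))
        for i in range(length)
    ]
```

With `alpha=1.0` the weights reduce to quantity-only weighting, as in sample-count-weighted averaging; lowering `alpha` shifts influence toward clients whose models score well on the evaluation set. A dynamic scheme, as the abstract describes, would adjust this balance across rounds rather than fixing it.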
