Data-Free Evaluation of User Contributions in Federated Learning
This work addresses fair compensation and malicious-user detection in federated learning on mobile devices, offering a solution that removes the need for a representative test dataset, a known bottleneck.
The paper tackles the problem of evaluating user contributions in federated learning without a test dataset. It proposes Pairwise Correlated Agreement (PCA), which uses the statistical correlation of the model parameters uploaded by users. Experiments on MNIST and an industrial product recommendation dataset show that Fed-PCA outperforms FedAvg and other baselines in accuracy, while PCA effectively incentivizes truthful behavior.
Federated learning (FL) trains a machine learning model on mobile devices in a distributed manner using each device's private data and computing resources. A critical issue is to evaluate individual users' contributions so that (1) users' effort in model training can be compensated with proper incentives and (2) malicious and low-quality users can be detected and removed. State-of-the-art solutions require a representative test dataset for evaluation, but such a dataset is often unavailable and hard to synthesize. In this paper, we propose a method called Pairwise Correlated Agreement (PCA), based on the idea of peer prediction, to evaluate user contribution in FL without a test dataset. PCA achieves this using the statistical correlation of the model parameters uploaded by users. We then apply PCA to design (1) a new federated learning algorithm called Fed-PCA and (2) a new incentive mechanism that guarantees truthfulness. We evaluate the performance of PCA and Fed-PCA using the MNIST dataset and a large industrial product recommendation dataset. The results demonstrate that Fed-PCA outperforms the canonical FedAvg algorithm and other baseline methods in accuracy, and at the same time, PCA effectively incentivizes users to behave truthfully.
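To make the idea of scoring users from parameter correlations concrete, the following is a minimal illustrative sketch. It is not the PCA scoring rule or the Fed-PCA algorithm from the paper, whose exact definitions are not given in this abstract. It assumes, purely for illustration, that each user uploads a flattened parameter update, that a user's score is its mean Pearson correlation with every other user's update, and that the server aggregates updates with weights proportional to the non-negative part of those scores. The function names `pairwise_agreement_scores` and `weighted_aggregate` and the synthetic data are hypothetical.

```python
import numpy as np


def pairwise_agreement_scores(updates):
    """Score each user by its mean Pearson correlation with all other users.

    updates: array of shape (n_users, n_params), one flattened update per user.
    Assumption: this averaging rule is a stand-in, not the paper's PCA score.
    """
    updates = np.asarray(updates, dtype=float)
    n_users = updates.shape[0]
    corr = np.corrcoef(updates)      # (n_users, n_users); each row is one user
    np.fill_diagonal(corr, 0.0)      # ignore each user's correlation with itself
    return corr.sum(axis=1) / (n_users - 1)


def weighted_aggregate(updates, scores):
    """Average the updates with weights proportional to non-negative scores.

    Assumption: a simple score-weighted variant of federated averaging.
    """
    updates = np.asarray(updates, dtype=float)
    weights = np.clip(scores, 0.0, None)
    if weights.sum() == 0:           # fall back to a plain average
        weights = np.ones_like(weights)
    weights = weights / weights.sum()
    return weights @ updates


# Synthetic example: four users report noisy copies of a shared update,
# and a fifth reports unrelated random parameters.
rng = np.random.default_rng(0)
true_update = rng.normal(size=200)
honest = true_update + 0.3 * rng.normal(size=(4, 200))
unrelated = rng.normal(size=(1, 200))
updates = np.vstack([honest, unrelated])

scores = pairwise_agreement_scores(updates)
aggregate = weighted_aggregate(updates, scores)
print(np.round(scores, 2))
```

In this toy setting the user whose update is unrelated to its peers receives a much lower score than the others and therefore contributes little to the aggregate, all without consulting any test dataset. A constant update vector would give an undefined correlation, which a real system would need to handle explicitly.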