LG, DC | Jan 21, 2025

CYCle: Choosing Your Collaborators Wisely to Enhance Collaborative Fairness in Decentralized Learning

Nurbek Tastan, Samuel Horvath, Karthik Nandakumar

arXiv:2501.12344v2 | 4 citations | h-index: 8 | Trans. Mach. Learn. Res.

Originality: Incremental advance

AI Analysis

This work addresses fairness in collaborative learning for participants in decentralized settings. The advance is incremental because it builds on existing decentralized methods.

The paper tackles the problem of unfair collaboration gains in decentralized learning, where some clients may be negatively impacted. It proposes the CYCle protocol to ensure positive and fair gains for all participants, even when data distributions are highly skewed.

Collaborative learning (CL) enables multiple participants to jointly train machine learning (ML) models on decentralized data sources without raw data sharing. While the primary goal of CL is to maximize the expected accuracy gain for each participant, it is also important to ensure that the gains are fairly distributed: no client should be negatively impacted, and gains should reflect contributions. Most existing CL methods require central coordination and focus only on gain maximization, overlooking fairness. In this work, we first show that the existing measure of collaborative fairness based on the correlation between accuracy values without and with collaboration has drawbacks because it does not account for negative collaboration gain. We argue that maximizing mean collaboration gain (MCG) while simultaneously minimizing the collaboration gain spread (CGS) is a fairer alternative. Next, we propose the CYCle protocol that enables individual participants in a private decentralized learning (PDL) framework to achieve this objective through a novel reputation scoring method based on gradient alignment between the local cross-entropy and distillation losses. We further extend the CYCle protocol to operate on top of gossip-based decentralized algorithms such as Gossip-SGD. We also theoretically show that CYCle performs better than standard FedAvg in a two-client mean estimation setting under high heterogeneity. Empirical experiments demonstrate the effectiveness of the CYCle protocol to ensure positive and fair collaboration gain for all participants, even in cases where the data distributions of participants are highly skewed.
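The abstract's point about the correlation-based fairness measure can be illustrated with a small numeric sketch. The abstract does not give formulas, so the definitions here are assumptions: collaboration gain is taken as accuracy with collaboration minus standalone accuracy, MCG as the mean of those gains, and CGS as their population standard deviation. The function names and accuracy values are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def collaboration_gains(standalone_acc, collab_acc):
    """Per-participant gain: accuracy with collaboration minus standalone accuracy (assumed definition)."""
    return [c - s for s, c in zip(standalone_acc, collab_acc)]


def mean_collaboration_gain(gains):
    """MCG, assumed here to be the plain mean of per-participant gains."""
    return mean(gains)


def collaboration_gain_spread(gains):
    """CGS, assumed here to be the population standard deviation of the gains."""
    return pstdev(gains)


def pearson(x, y):
    """Pearson correlation, standing in for the correlation-based fairness measure."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Hypothetical accuracies: every participant loses 5 points by collaborating.
standalone = [0.50, 0.60, 0.70]
collab = [0.45, 0.55, 0.65]

gains = collaboration_gains(standalone, collab)
corr = pearson(standalone, collab)
mcg = mean_collaboration_gain(gains)
cgs = collaboration_gain_spread(gains)

print(f"correlation = {corr:.3f}, MCG = {mcg:.3f}, CGS = {cgs:.3f}")
```

In this toy case the correlation between standalone and collaborative accuracy is essentially perfect even though every participant is worse off, while the negative MCG exposes the harm directly. Under these assumed definitions, a low CGS alone is not sufficient either, which is consistent with the abstract's framing of maximizing MCG and minimizing CGS together.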
