LG DC · Dec 13, 2020

Privacy-preserving Decentralized Aggregation for Federated Learning

Beomyeol Jeon, S. M. Ferdous, Muntasir Raihan Rahman, Anwar Walid

arXiv:2012.07183v27.279 citations

Originality: Highly original

AI Analysis

This work provides a new method for privacy-preserving federated learning, which is important for organizations and individuals concerned about data privacy in distributed machine learning settings.

This paper proposes a privacy-preserving decentralized aggregation protocol for federated learning that addresses the privacy weaknesses of distributed aggregation. The protocol controls communication among participants to minimize privacy leakage and performs comparably to standard centralized federated learning, with test accuracy degrading by at most 0.73%.

Federated learning is a promising framework for learning over decentralized data spanning multiple regions. This approach avoids the cost of aggregating training data centrally and can improve privacy because distributed sites do not have to reveal privacy-sensitive data. In this paper, we develop a privacy-preserving decentralized aggregation protocol for federated learning. We formulate the distributed aggregation protocol with the Alternating Direction Method of Multipliers (ADMM) and examine its privacy weakness. Unlike prior work that uses differential privacy or homomorphic encryption for privacy, we develop a protocol that controls communication among participants in each round of aggregation to minimize privacy leakage. We establish its privacy guarantee against an honest-but-curious adversary. We also propose an efficient algorithm to construct such a communication pattern, inspired by combinatorial block design theory. Our secure aggregation protocol, based on this novel group communication pattern design, leads to an efficient algorithm for federated training with privacy guarantees. We evaluate our federated training algorithm on image classification and next-word prediction applications over benchmark datasets with 9 and 15 distributed sites. Evaluation results show that our algorithm performs comparably to the standard centralized federated learning method while preserving privacy; the degradation in test accuracy is at most 0.73%.
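To make the idea of a controlled, round-by-round communication pattern more concrete, here is a minimal sketch. It is not the paper's protocol: the abstract does not specify the construction, so the grid layout, the function names (`grid_groups`, `mean_vector`, `group_average_rounds`), and the use of plain in-group averaging in place of ADMM are all assumptions made for illustration, and the sketch implies no privacy guarantee. It places 9 sites on a 3x3 grid, lets each site communicate only within its row in round one and only within its column in round two, and checks that every site ends up holding the global average.

```python
import random


def grid_groups(side):
    """Hypothetical pattern: round 1 groups are grid rows, round 2 groups are columns."""
    rows = [[r * side + c for c in range(side)] for r in range(side)]
    cols = [[r * side + c for r in range(side)] for c in range(side)]
    return [rows, cols]


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def group_average_rounds(updates, rounds):
    """In each round, every site replaces its vector with the mean of its group.

    Plain averaging stands in for the ADMM-based aggregation (an assumption).
    """
    values = [list(v) for v in updates]
    for groups in rounds:
        next_values = list(values)
        for group in groups:
            avg = mean_vector([values[i] for i in group])
            for i in group:
                next_values[i] = list(avg)
        values = next_values
    return values


if __name__ == "__main__":
    random.seed(0)
    # Nine sites, each holding a 4-dimensional local model update (synthetic data).
    local_updates = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(9)]
    aggregated = group_average_rounds(local_updates, grid_groups(3))
    target = mean_vector(local_updates)
    # Each column holds one site from every row, so two rounds reach the global mean.
    assert all(abs(a - b) < 1e-9 for v in aggregated for a, b in zip(v, target))
    print("all sites hold the global average after two rounds")
```

The two-round pattern works because the groups in the second round each contain exactly one member of every first-round group, so averaging the row means within a column reproduces the mean over all nine sites. Other group structures with this property could be substituted for the grid.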
