DC, AI, LG (Apr 29, 2021)

From Distributed Machine Learning to Federated Learning: A Survey

arXiv:2104.14362v4, 343 citations
Originality: Synthesis-oriented
AI Analysis

The paper addresses the challenge of enabling collaborative model training across decentralized data sources without direct data sharing, which is crucial for organizations and users subject to strict data regulations.

This survey paper tackles the problem of leveraging distributed data and computing resources for machine learning while adhering to legal and privacy constraints, by providing a comprehensive overview of federated learning, including its architecture, techniques, and future directions.

In recent years, data and computing resources are typically distributed in the devices of end users, various regions or organizations. Because of laws or regulations, the distributed data and computing resources cannot be directly shared among different regions or organizations for machine learning tasks. Federated learning emerges as an efficient approach to exploit distributed data and computing resources, so as to collaboratively train machine learning models, while obeying the laws and regulations and ensuring data security and data privacy. In this paper, we provide a comprehensive survey of existing works for federated learning. We propose a functional architecture of federated learning systems and a taxonomy of related techniques. Furthermore, we present the distributed training, data communication, and security of FL systems. Finally, we analyze their limitations and propose future research directions.
