ML LG ME
Jan 31, 2023

Distributed sequential federated learning

arXiv:2302.00107v1
h-index: 1
Originality: Incremental advance
AI Analysis

This paper addresses data security and efficiency concerns in distributed data analysis for applications such as healthcare. The contribution is incremental: it adds a sequential approach to existing federated learning.

The paper tackles the problem of aggregating information from multiple sites in federated learning, where averaging methods may fail because of data nonhomogeneity. It develops a sequential method that preserves properties such as data-driven sample size and estimation precision in generalized linear models, and demonstrates it on simulated data and a COVID-19 dataset from 32 hospitals in Mexico.

The analysis of data stored in multiple sites has become more popular, raising new concerns about the security of data storage and communication. Federated learning, which does not require centralizing data, is a common approach to preventing heavy data transportation, securing valued data, and protecting personal information. Therefore, determining how to aggregate the information obtained from the analysis of data in separate local sites has become an important statistical issue. The commonly used averaging methods may not be suitable due to data nonhomogeneity and incomparable results among individual sites, and applying them may result in the loss of information obtained from the individual analyses. Using a sequential method in federated learning with distributed computing can facilitate the integration and accelerate the analysis process. We develop a data-driven method for efficiently and effectively aggregating valued information by analyzing local data without encountering potential issues such as information security and heavy transportation due to data communication. In addition, the proposed method can preserve the properties of classical sequential adaptive design, such as data-driven sample size and estimation precision, when applied to generalized linear models. We use numerical studies of simulated data and an application to COVID-19 data collected from 32 hospitals in Mexico to illustrate the proposed method.
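
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes a logistic regression (one instance of a generalized linear model), sites that share only a score vector and an information matrix per batch, a recursive Newton-style update, and a classical fixed-size confidence-ellipsoid stopping rule. All names (`local_summaries`, `sequential_fit`, `make_sites`), the simulated sites, and the tuning values are hypothetical.

```python
import numpy as np

CHI2_95_DF2 = 5.991  # 95% quantile of a chi-square with 2 degrees of freedom


def local_summaries(X, y, beta):
    """Score and Fisher information of a logistic model on one site's batch.

    Only these two summaries leave the site; the raw rows (X, y) do not.
    """
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    score = X.T @ (y - p)
    info = (X * (p * (1.0 - p))[:, None]).T @ X
    return score, info


def sequential_fit(draw_batch, n_sites, dim, half_width, quantile, max_rounds=500):
    """Visit sites in turn and stop once the confidence ellipsoid is small enough.

    The sample size is data-driven: sampling stops when the longest semi-axis
    of {b : (b - beta)' I (b - beta) <= quantile} is at most half_width.
    """
    beta = np.zeros(dim)
    total_info = np.zeros((dim, dim))
    n_used = 0
    for r in range(max_rounds):
        X, y = draw_batch(r % n_sites)             # data stays at the site
        score, info = local_summaries(X, y, beta)  # only summaries are shared
        total_info += info
        beta = beta + np.linalg.solve(total_info, score)  # recursive Newton-style step
        n_used += len(y)
        lam_min = np.linalg.eigvalsh(total_info)[0]       # smallest eigenvalue
        if np.sqrt(quantile / lam_min) <= half_width:
            return beta, n_used, True
    return beta, n_used, False


def make_sites(beta_true, n_sites, batch_size, rng):
    """Simulated sites whose covariate distributions differ (assumed setup)."""
    means = np.linspace(-0.5, 0.5, n_sites)
    scales = np.linspace(0.8, 1.2, n_sites)

    def draw_batch(site):
        x = rng.normal(means[site], scales[site], size=batch_size)
        X = np.column_stack([np.ones(batch_size), x])
        p = 1.0 / (1.0 + np.exp(-(X @ beta_true)))
        y = (rng.random(batch_size) < p).astype(float)
        return X, y

    return draw_batch


rng = np.random.default_rng(0)
beta_true = np.array([0.5, -1.0])
draw_batch = make_sites(beta_true, n_sites=4, batch_size=100, rng=rng)
beta_hat, n_used, stopped = sequential_fit(
    draw_batch, n_sites=4, dim=2, half_width=0.3, quantile=CHI2_95_DF2
)
print(beta_hat, n_used, stopped)
```

In this sketch the number of observations used is not fixed in advance: a smaller `half_width` demands more accumulated information and therefore more site visits, which is the sense in which the sample size is data-driven.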

