ML LG AP | Sep 12, 2024

Federated One-Shot Ensemble Clustering

arXiv:2409.08396v1 | h-index: 6
Originality: Incremental advance
AI Analysis

FONT offers a scalable, practical approach to multi-site clustering under strict communication and privacy constraints, as in healthcare, though its contribution is incremental in that it combines existing techniques.

The paper addresses clustering across multiple institutions that cannot share patient-level data. It introduces the Federated One-shot Ensemble Clustering (FONT) algorithm, which needs only one round of communication and protects privacy by exchanging only fitted model parameters and class labels. In an application to rheumatoid arthritis patients across two health systems, FONT produced more consistent patient clusters across sites, while locally fitted clusters transferred less well.

Cluster analysis across multiple institutions poses significant challenges due to data-sharing restrictions. To overcome these limitations, we introduce the Federated One-shot Ensemble Clustering (FONT) algorithm, a novel solution tailored for multi-site analyses under such constraints. FONT requires only a single round of communication between sites and ensures privacy by exchanging only fitted model parameters and class labels. The algorithm combines locally fitted clustering models into a data-adaptive ensemble, making it broadly applicable to various clustering techniques and robust to differences in cluster proportions across sites. Our theoretical analysis validates the effectiveness of the data-adaptive weights learned by FONT, and simulation studies demonstrate its superior performance compared to existing benchmark methods. We applied FONT to identify subgroups of patients with rheumatoid arthritis across two health systems, revealing improved consistency of patient clusters across sites, while locally fitted clusters proved less transferable. FONT is particularly well-suited for real-world applications with stringent communication and privacy constraints, offering a scalable and practical solution for multi-site clustering.
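
The one-shot workflow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' FONT implementation: the k-means base learner, the function names (`fit_kmeans`, `align`, `one_shot_ensemble`), the synthetic data, and especially the inverse-inertia weights are assumptions made for illustration, standing in for the data-adaptive weights the paper derives. It shows only the general pattern: each site fits a model locally, centroids are exchanged once, and a site combines all models on its own data by a weighted vote.

```python
import itertools
import numpy as np

def sq_dists(centers, X):
    """Squared Euclidean distance from every row of X to every centroid."""
    return ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)

def predict(centers, X):
    """Assign each point to its nearest centroid."""
    return sq_dists(centers, X).argmin(axis=1)

def fit_kmeans(X, k, n_iter=50):
    """Lloyd's k-means with deterministic farthest-first initialisation."""
    centers = [X[0]]
    for _ in range(k - 1):
        d2 = sq_dists(np.array(centers), X).min(axis=1)
        centers.append(X[d2.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        labels = predict(centers, X)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def align(labels, ref, k):
    """Relabel `labels` with the permutation that agrees most with `ref`."""
    best = max(itertools.permutations(range(k)),
               key=lambda p: np.sum(np.asarray(p)[labels] == ref))
    return np.asarray(best)[labels]

def one_shot_ensemble(X_target, models, k):
    """Weighted vote over all sites' models, evaluated on local data only.

    Weights are an illustrative placeholder (inverse mean squared distance
    to the nearest centroid), NOT the weights defined in the paper.
    """
    ref = predict(models[0], X_target)
    fits = np.array([sq_dists(c, X_target).min(axis=1).mean() for c in models])
    weights = (1.0 / fits) / (1.0 / fits).sum()
    votes = np.zeros((len(X_target), k))
    for w, centers in zip(weights, models):
        lab = align(predict(centers, X_target), ref, k)
        votes[np.arange(len(X_target)), lab] += w
    return votes.argmax(axis=1), weights

# Synthetic two-site example with different cluster proportions per site.
rng = np.random.default_rng(0)

def make_site(n0, n1):
    X = np.vstack([rng.normal(0.0, 0.5, (n0, 2)),
                   rng.normal(5.0, 0.5, (n1, 2))])
    y = np.repeat([0, 1], [n0, n1])
    return X, y

XA, yA = make_site(80, 20)
XB, yB = make_site(30, 70)

# Single communication round: only the fitted centroids leave each site.
models = [fit_kmeans(XA, 2), fit_kmeans(XB, 2)]
labels_A, weights = one_shot_ensemble(XA, models, 2)
```

Here site A never sees site B's rows; it receives only B's centroid matrix and applies it to its own data. Label alignment uses brute-force permutations, which is reasonable only for a small number of clusters.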

