LG AI
Sep 20, 2023

Clustered FedStack: Intermediate Global Models with Bayesian Information Criterion

Thanveer Shaik, Xiaohui Tao, Lin Li, Niall Higgins, Raj Gururajan, Xujuan Zhou, Jianming Yong

arXiv:2309.11044v23.812 citations
h-index: 27

Originality: Incremental advance

AI Analysis

This work addresses data heterogeneity in federated learning for privacy-preserving AI applications, but it is an incremental advance because it builds on the existing FedStack framework.

The paper tackles federated learning challenges such as non-IID and imbalanced data by proposing a Clustered FedStack framework. The framework clusters clients by their output layer weights and uses BIC to determine the number of clusters. The Clustered FedStack models are reported to outperform baseline models with clustering mechanisms.

Federated Learning (FL) is currently one of the most popular technologies in the field of Artificial Intelligence (AI) due to its collaborative learning and ability to preserve client privacy. However, it faces challenges such as data that are not independent and identically distributed (non-IID) and imbalanced labels among local clients. To address these limitations, the research community has explored various approaches such as using local model parameters, federated generative adversarial learning, and federated representation learning. In our study, we propose a novel Clustered FedStack framework based on the previously published Stacked Federated Learning (FedStack) framework. The local clients send their model predictions and output layer weights to a server, which then builds a robust global model. This global model clusters the local clients based on their output layer weights using a clustering mechanism. We adopt three clustering mechanisms, namely K-Means, Agglomerative, and Gaussian Mixture Models, into the framework and evaluate their performance. We use the Bayesian Information Criterion (BIC) with the maximum likelihood function to determine the number of clusters. The Clustered FedStack models outperform baseline models with clustering mechanisms. To estimate the convergence of our proposed framework, we use cyclical learning rates.
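As a rough illustration of how BIC can pick the number of client clusters, the sketch below clusters hypothetical output layer weights with K-Means and scores each candidate k with BIC = p ln(n) - 2 ln(L), where p is the number of free parameters, n the number of clients, and L the maximized likelihood. It is not the authors' implementation. The synthetic 2-D "weights", the farthest-point seeding, the shared spherical Gaussian likelihood, and all function names are assumptions made for this example.

```python
import math
import random


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start at
    points[0], then repeatedly add the point farthest from all chosen seeds."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm. Returns (labels, residual sum of squares)."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # leave an empty cluster's centroid where it is
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    rss = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, rss


def bic(points, labels, rss, k):
    """BIC under an assumed spherical Gaussian per cluster with one shared
    variance and mixing weights n_j / n. Lower is better."""
    n, d = len(points), len(points[0])
    var = max(rss / (n * d), 1e-12)  # maximum-likelihood shared variance
    sizes = [labels.count(j) for j in range(k)]
    log_lik = sum(s * math.log(s / n) for s in sizes if s)
    log_lik -= 0.5 * n * d * math.log(2 * math.pi * var) + rss / (2 * var)
    n_params = k * d + (k - 1) + 1  # means + mixing weights + variance
    return n_params * math.log(n) - 2 * log_lik


def select_k(points, k_range):
    """Score every candidate k and return (best k, all scores)."""
    scores = {}
    for k in k_range:
        labels, rss = kmeans(points, k)
        scores[k] = bic(points, labels, rss, k)
    return min(scores, key=scores.get), scores


# Hypothetical data: 60 clients whose flattened output layer weights
# (reduced to 2-D here for readability) fall into three tight groups.
rng = random.Random(0)
centers = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
client_weights = [
    (cx + rng.gauss(0, 0.1), cy + rng.gauss(0, 0.1))
    for cx, cy in centers
    for _ in range(20)
]

best_k, scores = select_k(client_weights, range(1, 6))
print("selected number of clusters:", best_k)
```

On this well-separated synthetic data the lowest BIC should fall at k = 3, because splitting a tight group further reduces the residual only slightly while adding parameters. With a Gaussian Mixture Model, the same selection loop would apply with the mixture's own maximized likelihood in place of the K-Means approximation used here.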

