Federated Bayesian Network Ensembles
This work addresses privacy-preserving machine learning for settings where local data are biased or imbalanced across parties. The contribution is incremental: it applies established ensemble methods to a federated setting.
The paper tackles training machine learning models on decentralized data under privacy constraints. It proposes Federated Bayesian Network Ensembles (FBNE), which outperform local models and perform similarly to an existing federated method while training significantly faster.
Federated learning makes it possible to run machine learning algorithms on decentralized data when data sharing is not permitted due to privacy concerns. Ensemble-based learning trains multiple (weak) classifiers and aggregates their outputs. A federated ensemble applies this idea to a federated setting: each classifier in the ensemble is trained at one data location. In this article, we explore federated ensembles of Bayesian networks (FBNE) in a range of experiments and compare their performance with locally trained models and with models trained using VertiBayes, a federated learning algorithm for training Bayesian networks on decentralized data. Our results show that FBNE outperforms local models and trains significantly faster than VertiBayes while maintaining similar performance in most settings. We show that FBNE is a potentially useful tool within the federated learning toolbox, especially when local populations are heavily biased or population sizes are strongly imbalanced across parties. We discuss the advantages and disadvantages of this approach in terms of time complexity, model accuracy, privacy protection, and model interpretability.
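The federated-ensemble idea described above (one classifier per data location, outputs aggregated) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: each party holds the same categorical feature columns, the local model is a naive Bayes classifier (the simplest Bayesian network, with the class as the only parent of each feature), and predictions are combined by a sample-size-weighted average of class probabilities. The names `LocalNaiveBayes` and `ensemble_predict_proba`, the toy data, and the weighting rule are all hypothetical.

```python
from collections import Counter, defaultdict


class LocalNaiveBayes:
    """Categorical naive Bayes fitted on one party's data only.

    Stand-in for a locally learned Bayesian network (an assumption of
    this sketch, not the structure used in the paper).
    """

    def __init__(self, classes, alpha=1.0):
        self.classes = list(classes)
        self.alpha = alpha  # Laplace smoothing constant

    def fit(self, rows, labels):
        self.n = len(labels)
        self.class_counts = Counter(labels)
        self.counts = defaultdict(Counter)  # (feature index, class) -> value counts
        self.values = defaultdict(set)      # feature index -> values seen locally
        for row, y in zip(rows, labels):
            for j, v in enumerate(row):
                self.counts[(j, y)][v] += 1
                self.values[j].add(v)
        return self

    def predict_proba(self, row):
        scores = {}
        for c in self.classes:
            # Smoothed class prior
            p = (self.class_counts[c] + self.alpha) / (
                self.n + self.alpha * len(self.classes)
            )
            # Smoothed per-feature likelihoods, multiplied together
            for j, v in enumerate(row):
                k = len(self.values[j] | {v})
                p *= (self.counts[(j, c)][v] + self.alpha) / (
                    self.class_counts[c] + self.alpha * k
                )
            scores[c] = p
        total = sum(scores.values())
        return {c: s / total for c, s in scores.items()}


def ensemble_predict_proba(models, weights, row):
    """Weighted average of the local models' class probabilities.

    The weighting scheme is an assumption of this sketch.
    """
    total_w = sum(weights)
    combined = {c: 0.0 for c in models[0].classes}
    for model, w in zip(models, weights):
        local = model.predict_proba(row)
        for c in combined:
            combined[c] += w * local[c] / total_w
    return combined


# Toy data: raw rows never leave a party; only fitted models are combined.
party_a = ([("y", "h"), ("y", "h"), ("n", "l")], [1, 1, 0])
party_b = ([("n", "l"), ("n", "h"), ("y", "l"), ("n", "l")], [0, 0, 1, 0])

models = [LocalNaiveBayes(classes=[0, 1]).fit(*party) for party in (party_a, party_b)]
weights = [len(party[1]) for party in (party_a, party_b)]

proba = ensemble_predict_proba(models, weights, ("y", "h"))
prediction = max(proba, key=proba.get)
```

In this sketch the two parties see different class balances, so their local models disagree in confidence; the weighted average lets the larger party count for more. Other aggregation rules (majority vote, unweighted mean) would slot into `ensemble_predict_proba` without changing the local training step.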