CEAI Aug 21, 2024

Federated Diabetes Prediction in Canadian Adults Using Real-world Cross-Province Primary Care Data

arXiv:2408.12029v1 | 4 citations | h-index: 3
Originality: Synthesis-oriented
AI Analysis

This work addresses privacy concerns in sharing healthcare data for diabetes prediction in Canadian adults. Its contribution is incremental: it applies existing federated learning methods to a new dataset.

The paper tackles the challenge of predicting diabetes in Canadian adults from real-world primary care data held in different provinces, without sharing patient data between them. It applies federated learning to avoid privacy issues and finds that a federated MLP model performs similarly to or better than a centralized model, while federated logistic regression underperforms its centralized counterpart.
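To make the idea of training across provinces without pooling records concrete, here is a minimal sketch of one common federated scheme, weighted parameter averaging (FedAvg-style), applied to logistic regression. The summary does not say which aggregation rule the paper uses, so the averaging rule, the function names (`local_update`, `federated_average`, `train_federated`), the hyperparameters, and the synthetic data are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Run a few epochs of full-batch gradient descent on one site's data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w

def federated_average(site_weights, site_sizes):
    """Average the site models, weighting each by its local sample count."""
    sizes = np.asarray(site_sizes, dtype=float)
    return np.average(np.stack(site_weights), axis=0, weights=sizes)

def train_federated(sites, n_features, rounds=20):
    """Only model parameters leave a site; the (X, y) pairs never do."""
    global_w = np.zeros(n_features)
    for _ in range(rounds):
        updates = [local_update(global_w, X, y) for X, y in sites]
        global_w = federated_average(updates, [len(y) for _, y in sites])
    return global_w

# Synthetic stand-in for three sites (hypothetical data, not CPCSSN records).
rng = np.random.default_rng(0)
true_w = np.array([1.5, -2.0, 0.5])
sites = []
for n in (200, 500, 300):
    X = rng.normal(size=(n, 3))
    y = (sigmoid(X @ true_w) > rng.uniform(size=n)).astype(float)
    sites.append((X, y))

global_w = train_federated(sites, n_features=3)
print(global_w)
```

In this sketch each site contributes in proportion to its sample count, so a large province would dominate the averaged model. Whether that matches the paper's setup is not stated in the summary.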

Integrating Electronic Health Records (EHR) with machine learning presents opportunities for enhancing the accuracy and accessibility of data-driven diabetes prediction. In particular, data-driven machine learning models can identify patients at high risk of diabetes early, potentially leading to more effective therapeutic strategies and reduced healthcare costs. However, regulatory restrictions create barriers to developing centralized predictive models. This paper addresses these challenges by introducing a federated learning approach, which amalgamates predictive models without centralized data storage and processing, thus avoiding privacy issues. This marks the first application of federated learning to predict diabetes using real clinical datasets in Canada, extracted from the Canadian Primary Care Sentinel Surveillance Network (CPCSSN), without cross-province patient data sharing. We address class-imbalance issues through downsampling techniques and compare federated learning performance against province-based and centralized models. Experimental results show that the federated MLP model performs similarly to or better than the model trained with the centralized approach. However, the federated logistic regression model showed inferior performance compared to its centralized peer.
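The abstract mentions downsampling to handle class imbalance but does not say which variant is used. A minimal sketch of one plausible reading, random undersampling of the larger class until both classes are the same size, might look like the following. The function name `downsample_majority`, the seed, and the toy arrays are hypothetical.

```python
import numpy as np

def downsample_majority(X, y, seed=0):
    """Randomly drop rows so every class keeps as many rows as the rarest class."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    classes, counts = np.unique(y, return_counts=True)
    n_keep = counts.min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_keep, replace=False)
        for c in classes
    ])
    rng.shuffle(keep)  # avoid leaving the rows grouped by class
    return X[keep], y[keep]

# Toy example: 8 negative rows and 2 positive rows (made-up values).
X = np.arange(20).reshape(10, 2)
y = np.array([0] * 8 + [1] * 2)
Xb, yb = downsample_majority(X, y)
print(np.bincount(yb))
```

In a federated setting this step would presumably run separately inside each province, since the raw labels cannot be pooled. That is an inference from the abstract rather than something it states.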

