SPAILG · Oct 28, 2024

FedCVD: The First Real-World Federated Learning Benchmark on Cardiovascular Disease Data

arXiv:2411.07050v1 · 6 citations · h-index: 8 · Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of limited real-world FL benchmarks for healthcare researchers, though it is incremental as it focuses on creating a dataset rather than advancing FL methods.

The paper tackles the lack of real-world federated learning benchmarks for cardiovascular disease detection by introducing FedCVD, a benchmark with ECG classification and ECHO segmentation tasks based on data from seven institutions, revealing challenges with non-IID and long-tail data in FL.

Cardiovascular diseases (CVDs) are currently the leading cause of death worldwide, highlighting the critical need for early diagnosis and treatment. Machine learning (ML) methods can help diagnose CVDs early, but their performance relies on access to substantial, high-quality data. However, the sensitive nature of healthcare data often prevents individual clinical institutions from sharing the data needed to train sufficiently generalized and unbiased ML models. Federated Learning (FL), an emerging approach, offers a promising solution by enabling collaborative model training across multiple participants without compromising the privacy of the individual data owners. However, to the best of our knowledge, there has been limited prior research applying FL to the cardiovascular disease domain. Moreover, existing FL benchmarks and datasets are typically simulated and may fall short of replicating the natural heterogeneity of realistic datasets, which challenges current FL algorithms. To address these gaps, this paper presents the first real-world FL benchmark for cardiovascular disease detection, named FedCVD. This benchmark comprises two major tasks: electrocardiogram (ECG) classification and echocardiogram (ECHO) segmentation, based on naturally scattered datasets constructed from the CVD data of seven institutions. Our extensive experiments on these datasets reveal that FL faces new challenges with real-world non-IID and long-tail data. The code and datasets of FedCVD are available at https://github.com/SMILELab-FL/FedCVD.

