IV CV, Nov 23, 2020

Federated Semi-Supervised Learning for COVID Region Segmentation in Chest CT using Multi-National Data from China, Italy, Japan

Dong Yang, Ziyue Xu, Wenqi Li, Andriy Myronenko, Holger R. Roth, Stephanie Harmon, Sheng Xu, Baris Turkbey, Evrim Turkbey, Xiaosong Wang, Wentao Zhu, Gianpaolo Carrafiello

arXiv:2011.11750v122.7104 citations

Originality: Highly original

AI Analysis

This work is significant for medical institutions and nations facing strict data privacy regulations, as it provides a solution for collaborative model training without sharing sensitive patient data, specifically for COVID-19 CT analysis.

This paper addresses the challenge of domain shift in COVID-19 region segmentation from chest CT scans across different clinical centers by proposing a federated semi-supervised learning approach. Using a multi-national dataset of 1704 scans from China, Italy, and Japan, the framework effectively utilizes both annotated and unannotated data, demonstrating improved performance compared to conventional fully supervised methods with data sharing.

The recent outbreak of COVID-19 has led to urgent needs for reliable diagnosis and management of SARS-CoV-2 infection. As a complementary tool, chest CT has been shown to reveal visual patterns characteristic of COVID-19, which has definite value at several stages during the disease course. To facilitate CT analysis, recent efforts have focused on computer-aided characterization and diagnosis, which have shown promising results. However, domain shift of data across clinical data centers poses a serious challenge when deploying learning-based models. In this work, we attempt to find a solution for this challenge via federated and semi-supervised learning. A multi-national database consisting of 1704 scans from three countries is adopted to study the performance gap when training a model with one dataset and applying it to another. Expert radiologists manually delineated 945 scans for COVID-19 findings. To handle the variability in both the data and annotations, a novel federated semi-supervised learning technique is proposed to fully utilize all available data (with or without annotations). Federated learning avoids the need for sensitive data sharing, which makes it favorable for institutions and nations with strict regulatory policies on data privacy. Moreover, semi-supervision potentially reduces the annotation burden under a distributed setting. The proposed framework is shown to be effective compared to fully supervised scenarios with conventional data sharing instead of model weight sharing.
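To make the phrase "model weight sharing" concrete, here is a minimal sketch of a generic federated-averaging step, in which each site sends only its model parameters and a server combines them. The abstract does not specify the paper's aggregation rule, so the size-weighted average below is an assumption, and the function name `federated_average`, the parameter names, and the per-site scan counts are all hypothetical.

```python
from typing import Dict, List

# A model is represented here as a mapping from parameter name to a flat
# list of floats. This is an illustrative simplification, not the paper's model.
Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Combine per-site parameters into one global model.

    Each site's contribution is weighted by its number of local scans.
    Only parameters cross site boundaries; the scans never do.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one size per client and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    averaged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Toy example: three sites with made-up parameters and made-up scan counts.
site_weights = [
    {"conv.weight": [1.0, 2.0], "conv.bias": [0.0]},
    {"conv.weight": [3.0, 4.0], "conv.bias": [1.0]},
    {"conv.weight": [5.0, 6.0], "conv.bias": [2.0]},
]
site_sizes = [100, 100, 200]
global_weights = federated_average(site_weights, site_sizes)
```

In a full training loop, the server would send `global_weights` back to every site, each site would continue training on its local annotated and unannotated scans, and the step would repeat for several rounds.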
