IV · CV · Jun 7, 2022

COVIDx CT-3: A Large-scale, Multinational, Open-Source Benchmark Dataset for Computer-aided COVID-19 Screening from Chest CT Images

arXiv:2206.03043v3 · 8 citations · h-index: 11
Originality: Synthesis-oriented
AI Analysis

This dataset addresses the need for larger and more diverse data for developing computer-aided COVID-19 screening systems, though the contribution is incremental because it builds on existing data collection efforts.

The authors address the limited quantity and diversity of CT data for COVID-19 screening by introducing COVIDx CT-3, a large-scale multinational benchmark dataset with 431,205 CT slices from 6,068 patients across at least 17 countries, which to the authors' knowledge is the largest open-access dataset of its kind.

Computed tomography (CT) has been widely explored as a COVID-19 screening and assessment tool to complement RT-PCR testing. To assist radiologists with CT-based COVID-19 screening, a number of computer-aided systems have been proposed. However, many proposed systems are built using CT data which is limited in both quantity and diversity. Motivated to support efforts in the development of machine learning-driven screening systems, we introduce COVIDx CT-3, a large-scale multinational benchmark dataset for detection of COVID-19 cases from chest CT images. COVIDx CT-3 includes 431,205 CT slices from 6,068 patients across at least 17 countries, which to the best of our knowledge represents the largest, most diverse dataset of COVID-19 CT images in open-access form. Additionally, we examine the data diversity and potential biases of the COVIDx CT-3 dataset, finding that significant geographic and class imbalances remain despite efforts to curate data from a wide variety of sources.
