CV · Aug 19, 2024

PolypDB: A Curated Multi-Center Dataset for Development of AI Algorithms in Colonoscopy

arXiv:2409.00045v2 · 14 citations · h-index: 36 · Has Code
Originality: Synthesis-oriented
AI Analysis

This dataset addresses a critical gap for researchers developing AI algorithms to reduce polyp miss rates in colonoscopy, though the contribution is incremental because it builds on existing data collection efforts.

The authors tackled the lack of diverse public datasets for AI in colonoscopy by introducing PolypDB, a multi-center dataset with 3934 polyp images from five modalities and three countries, providing benchmarks for detection and segmentation tasks.

Colonoscopy is the primary method for examination, detection, and removal of polyps. However, challenges such as variation in endoscopists' skills, the quality of bowel preparation, and the complex nature of the large intestine contribute to a high polyp miss rate. Missed polyps can later develop into cancer, underscoring the importance of improving detection methods. To address the lack of publicly available, large, diverse, multi-center datasets for developing automatic methods for polyp detection and segmentation, we introduce PolypDB, a large-scale publicly available dataset that contains 3934 still polyp images and their corresponding ground truth from real colonoscopy videos. PolypDB comprises images from five modalities: Blue Light Imaging (BLI), Flexible Imaging Color Enhancement (FICE), Linked Color Imaging (LCI), Narrow Band Imaging (NBI), and White Light Imaging (WLI), collected at three medical centers in Norway, Sweden, and Vietnam. We provide a benchmark on each modality and center, including federated learning settings, using popular segmentation and detection benchmarks. PolypDB is public and can be downloaded at https://osf.io/pr7ms/. More information about the dataset, the segmentation, detection, and federated learning benchmarks, and the train-test split can be found at https://github.com/DebeshJha/PolypDB.

Code Implementations: 1 repo
