IV, CV · Sep 4, 2024

Coupling AI and Citizen Science in Creation of Enhanced Training Dataset for Medical Image Segmentation

arXiv:2409.03087v2 · 1 citation · h-index: 8
AI Analysis

This work addresses the scalability of annotated-data creation in medical imaging for AI developers and researchers, though it is incremental: it builds on existing methods such as MedSAM and pix2pixGAN.

The paper tackles the shortage of high-quality annotated datasets for medical image segmentation. It introduces a framework that combines AI and crowdsourcing to improve both dataset quality and quantity, and reports significantly improved model performance, particularly when training data is limited.

Recent advancements in medical imaging and artificial intelligence (AI) have greatly enhanced diagnostic capabilities, but the development of effective deep learning (DL) models is still constrained by the lack of high-quality annotated datasets. The traditional manual annotation process by medical experts is time- and resource-intensive, limiting the scalability of these datasets. In this work, we introduce a robust and versatile framework that combines AI and crowdsourcing to improve both the quality and quantity of medical image datasets across different modalities. Our approach utilises a user-friendly online platform that enables a diverse group of crowd annotators to label medical images efficiently. By integrating the MedSAM segmentation AI with this platform, we accelerate the annotation process while maintaining expert-level quality through an algorithm that merges crowd-labelled images. Additionally, we employ pix2pixGAN, a generative AI model, to expand the training dataset with synthetic images that capture realistic morphological features. These methods are combined into a cohesive framework designed to produce an enhanced dataset, which can serve as a universal pre-processing pipeline to boost the training of any medical deep learning segmentation model. Our results demonstrate that this framework significantly improves model performance, especially when training data is limited.
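The abstract mentions an algorithm that merges crowd-labelled images but does not describe it. As a rough illustration only, one common way to combine several annotators' binary masks is a pixel-wise majority vote, with the Dice coefficient used to compare the merged mask against an individual annotation. The sketch below assumes that approach; the function names `merge_crowd_masks` and `dice`, the `threshold` parameter, and the toy masks are hypothetical and are not taken from the paper.

```python
from typing import List

# A binary mask as nested lists: 1 = foreground pixel, 0 = background pixel.
Mask = List[List[int]]


def merge_crowd_masks(masks: List[Mask], threshold: float = 0.5) -> Mask:
    """Pixel-wise vote (an assumed merging rule, not the paper's algorithm).

    A pixel is kept as foreground when the fraction of annotators who
    marked it is strictly greater than `threshold`.
    """
    if not masks:
        raise ValueError("need at least one mask")
    rows, cols = len(masks[0]), len(masks[0][0])
    for m in masks:
        if len(m) != rows or any(len(row) != cols for row in m):
            raise ValueError("all masks must share the same shape")
    n = len(masks)
    return [
        [1 if sum(m[r][c] for m in masks) / n > threshold else 0
         for c in range(cols)]
        for r in range(rows)
    ]


def dice(a: Mask, b: Mask) -> float:
    """Dice coefficient between two binary masks (1.0 if both are empty)."""
    inter = sum(x & y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    total = sum(map(sum, a)) + sum(map(sum, b))
    return 1.0 if total == 0 else 2.0 * inter / total


# Toy example: three annotators label the same 2x3 image.
crowd = [
    [[1, 1, 0], [0, 1, 0]],
    [[1, 0, 0], [0, 1, 1]],
    [[1, 1, 0], [0, 0, 0]],
]
merged = merge_crowd_masks(crowd)
agreement = [dice(merged, m) for m in crowd]
```

A strict majority is the simplest choice; a real pipeline might instead weight annotators by their past agreement with expert labels, which this sketch does not attempt.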

