Self-Supervised Multiple Instance Learning for Acute Myeloid Leukemia Classification
The approach offers a cost-effective and data-efficient route to AI-based disease diagnosis in medical imaging, particularly for diseases with scarce annotations. The contribution is incremental, however, since it adapts existing SSL methods to a known bottleneck.
The study addresses the classification of Acute Myeloid Leukemia (AML) subtypes from blood smears using Multiple Instance Learning (MIL), whose encoders are typically trained with labeled data. It replaces that supervised encoder training with Self-Supervised Learning (SSL) pre-training using SimCLR, SwAV, and DINO, and finds that SSL-pretrained encoders achieve performance comparable to supervised pre-training.
Automated disease diagnosis using medical image analysis relies on deep learning, which often requires large labeled datasets for supervised model training. Diseases like Acute Myeloid Leukemia (AML) pose challenges because annotations at the single-cell level are scarce and costly. Multiple Instance Learning (MIL) addresses weakly labeled scenarios but requires powerful encoders that are typically trained with labeled data. In this study, we explore Self-Supervised Learning (SSL) as a pre-training approach for MIL-based AML subtype classification from blood smears, removing the need for labeled data during encoder training. We investigate three state-of-the-art SSL methods, SimCLR, SwAV, and DINO, and compare their performance against supervised pre-training. Our findings show that SSL-pretrained encoders achieve performance comparable to supervised pre-training, showcasing the potential of SSL in MIL. This offers a cost-effective and data-efficient solution for AI-based disease diagnosis.
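To make the MIL setting concrete, the following is a minimal sketch of how a bag of single-cell embeddings from one blood smear could be pooled into a single patient-level representation. It assumes attention-based pooling, which the abstract does not specify, and it uses random vectors in place of the outputs of a frozen, SSL-pretrained encoder. The names `attention_mil_pool`, `V`, and `w` are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_mil_pool(bag, V, w):
    """Pool a bag of instance embeddings into one bag embedding.

    bag: list of K embeddings (each a list of D floats), e.g. one per cell.
    V:   hidden projection, H rows of D floats (hypothetical parameter).
    w:   attention vector of H floats (hypothetical parameter).
    Returns (bag_embedding, attention_weights).
    """
    scores = []
    for h in bag:
        # Score each instance: w^T tanh(V h)
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * hi for wi, hi in zip(w, hidden)))
    weights = softmax(scores)
    dim = len(bag[0])
    # Bag embedding is the attention-weighted sum of instance embeddings.
    pooled = [sum(a * h[d] for a, h in zip(weights, bag)) for d in range(dim)]
    return pooled, weights


# Toy bag: 5 "cells", each with a 4-dimensional embedding standing in for
# the output of a frozen, SSL-pretrained encoder (assumption, not real data).
rng = random.Random(0)
bag = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(5)]
V = [[rng.gauss(0.0, 0.5) for _ in range(4)] for _ in range(3)]
w = [rng.gauss(0.0, 0.5) for _ in range(3)]

bag_embedding, attention = attention_mil_pool(bag, V, w)
```

In such a setup only the pooling parameters and a small classifier on top of `bag_embedding` would need the patient-level subtype label, while the encoder that produces the per-cell embeddings could be pre-trained without any labels.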