CV · Apr 11, 2023

Self-supervision for medical image classification: state-of-the-art performance with ~100 labeled training samples per class

arXiv:2304.05163v2 · 23 citations · h-index: 29
Originality: Incremental advance
AI Analysis

This work addresses data scarcity, a central bottleneck in medical image analysis, and offers a practical solution for domains with limited labeled datasets. It is incremental, however, because it builds on existing self-supervised methods.

The paper tackles the problem of limited labeled data in medical image classification by applying self-supervised learning with the DINO framework. It achieves state-of-the-art performance using only 1-10% of the available labeled data and about 100 labeled samples per class across three medical imaging modalities.

Is self-supervised deep learning (DL) for medical image analysis already a serious alternative to the de facto standard of end-to-end trained supervised DL? We tackle this question for medical image classification, with a particular focus on one of the currently most limiting factors of the field: the (non-)availability of labeled data. Based on three common medical imaging modalities (bone marrow microscopy, gastrointestinal endoscopy, dermoscopy) and publicly available data sets, we analyze the performance of self-supervised DL within the self-distillation with no labels (DINO) framework. After learning an image representation without use of image labels, conventional machine learning classifiers are applied. The classifiers are fit using a systematically varied number of labeled data (1-1000 samples per class). Exploiting the learned image representation, we achieve state-of-the-art classification performance for all three imaging modalities and data sets with only a fraction of between 1% and 10% of the available labeled data and about 100 labeled samples per class.
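The evaluation protocol described in the abstract (fix a learned representation, then fit a conventional classifier on a varied number of labeled samples per class) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the embeddings are synthetic Gaussian blobs standing in for DINO features, a nearest-centroid classifier stands in for whichever conventional classifiers the paper uses, and all function names (`make_synthetic_embeddings`, `sample_per_class`, `label_efficiency_curve`) are hypothetical.

```python
import random
from collections import defaultdict


def make_synthetic_embeddings(n_classes=3, n_per_class=200, dim=8, seed=0):
    """Assumption: Gaussian blobs stand in for frozen self-supervised features."""
    rng = random.Random(seed)
    features, labels = [], []
    for c in range(n_classes):
        center = [rng.uniform(-5, 5) for _ in range(dim)]
        for _ in range(n_per_class):
            features.append([m + rng.gauss(0, 1) for m in center])
            labels.append(c)
    return features, labels


def sample_per_class(labels, candidates, n, rng):
    """Draw up to n labeled indices per class from the candidate pool."""
    by_class = defaultdict(list)
    for i in candidates:
        by_class[labels[i]].append(i)
    chosen = []
    for c in sorted(by_class):
        idx = by_class[c]
        chosen.extend(rng.sample(idx, min(n, len(idx))))
    return chosen


def fit_centroids(features, labels, idx):
    """Nearest-centroid 'training': average the feature vectors of each class."""
    grouped = defaultdict(list)
    for i in idx:
        grouped[labels[i]].append(features[i])
    return {c: [sum(col) / len(rows) for col in zip(*rows)]
            for c, rows in grouped.items()}


def predict(centroids, x):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(centroids[c], x)))


def label_efficiency_curve(features, labels, budgets=(1, 10, 100), seed=0):
    """Accuracy on a held-out half as the per-class label budget grows."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)
    split = len(indices) // 2
    pool, test = indices[:split], indices[split:]
    curve = {}
    for n in budgets:
        train = sample_per_class(labels, pool, n, rng)
        centroids = fit_centroids(features, labels, train)
        correct = sum(predict(centroids, features[i]) == labels[i] for i in test)
        curve[n] = correct / len(test)
    return curve


features, labels = make_synthetic_embeddings()
curve = label_efficiency_curve(features, labels)
for n, acc in curve.items():
    print(f"{n:>4} labeled samples per class: accuracy {acc:.3f}")
```

In a real experiment the synthetic features would be replaced by embeddings extracted from the self-supervised model, and the sampling would typically be repeated over several random seeds to estimate the variance at each label budget.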

Code Implementations: 1 repo
