CV · Aug 1, 2022

Information Gain Sampling for Active Learning in Medical Image Classification

arXiv:2208.00974v1 · 7 citations · h-index: 38
Originality: Incremental advance
AI Analysis

This work addresses the challenge of reducing labeling costs for medical image classification. The advance is incremental: it builds on existing active learning methods and adds specific adaptations.

The paper tackles the problem of limited annotated medical image datasets by proposing an active learning framework that selects which images to label based on expected information gain, adapted to account for class imbalance. It reports that this method reaches about 95% of overall performance with only 19% of the training data, whereas the baseline active learning approaches require around 25%.

Large, annotated datasets are not widely available in medical image analysis due to the prohibitive time, costs, and challenges associated with labelling large datasets. Unlabelled datasets are easier to obtain, and in many contexts, it would be feasible for an expert to provide labels for a small subset of images. This work presents an information-theoretic active learning framework that guides the optimal selection of images from the unlabelled pool to be labelled, based on maximizing the expected information gain (EIG) on an evaluation dataset. Experiments are performed on two different medical image classification datasets: multi-class diabetic retinopathy disease scale classification and multi-class skin lesion classification. Results indicate that by adapting EIG to account for class imbalances, our proposed Adapted Expected Information Gain (AEIG) outperforms several popular baselines, including the diversity-based CoreSet and uncertainty-based maximum entropy sampling. Specifically, AEIG achieves ~95% of overall performance with only 19% of the training data, while other active learning approaches require around 25%. We show that, by careful design choices, our model can be integrated into existing deep learning classifiers.
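
For orientation, the uncertainty-based maximum entropy sampling baseline named in the abstract can be sketched as below. This is a minimal illustration and not the paper's AEIG criterion: the function names, the toy probability matrix, and the labelling budget are all assumptions made for the example. The idea is to score each unlabelled image by the Shannon entropy of the classifier's predicted class distribution and send the highest-scoring images to the expert.

```python
import numpy as np


def predictive_entropy(probs):
    """Shannon entropy (in nats) of each row of class probabilities."""
    # Clip to avoid log(0); a zero-probability class then contributes ~0.
    probs = np.clip(probs, 1e-12, 1.0)
    return -np.sum(probs * np.log(probs), axis=1)


def select_max_entropy(probs, budget):
    """Indices of the `budget` most uncertain pool samples, most uncertain first."""
    scores = predictive_entropy(probs)
    order = np.argsort(-scores, kind="stable")
    return order[:budget]


# Hypothetical softmax outputs for four unlabelled images and three classes.
pool_probs = np.array([
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # near-uniform prediction, highest entropy
    [0.70, 0.20, 0.10],
    [0.50, 0.50, 0.00],
])

picked = select_max_entropy(pool_probs, budget=2)
print(picked)
```

In a full active learning loop, the selected images would be labelled, added to the training set, and the classifier retrained before the next round of scoring. An EIG-style criterion differs in that it scores a candidate by the expected reduction in uncertainty on a separate evaluation set rather than by the candidate's own predictive entropy.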

