CLINICAL: Targeted Active Learning for Imbalanced Medical Image Classification
This work addresses the challenge of improving model performance on rare classes in medical imaging, which is crucial for accurate diagnosis when datasets are imbalanced. It represents an incremental advance in active learning techniques.
The paper tackles class imbalance in medical image classification by proposing Clinical, a targeted active learning framework that uses submodular mutual information functions to select data points from rare classes. It demonstrates that Clinical outperforms state-of-the-art active learning methods in binary and long-tail imbalance scenarios.
Training deep learning models that perform well on all classes of a medical dataset is a challenging task. Performance on some classes is often suboptimal because of the natural class imbalance that comes with medical data. An effective way to tackle this problem is targeted active learning, where we iteratively add data points that belong to the rare classes to the training data. However, existing active learning methods are ineffective at targeting rare classes in medical datasets. In this work, we propose Clinical (targeted aCtive Learning for ImbalaNced medICal imAge cLassification), a framework that uses submodular mutual information functions as acquisition functions to mine critical data points from rare classes. We apply our framework to a wide array of medical imaging datasets under a variety of real-world class imbalance scenarios, namely binary imbalance and long-tail imbalance. We show that Clinical outperforms state-of-the-art active learning methods by acquiring a diverse set of data points that belong to the rare classes.
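To make the idea of a submodular mutual information (SMI) acquisition function concrete, the following is a minimal sketch of one selection round. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say which SMI function or feature representation is used, so the sketch assumes a facility-location-style SMI objective, cosine similarity over embedding vectors, and a plain greedy maximizer. The names `cosine_similarity`, `smi_value`, `greedy_select`, and the trade-off parameter `eta` are hypothetical.

```python
import numpy as np


def cosine_similarity(a, b):
    """Pairwise cosine similarity between rows of a and rows of b.

    Assumes every row is a nonzero embedding vector.
    """
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def smi_value(sim, selected, eta=1.0):
    """Facility-location-style SMI between a selected set A and a query set Q.

    sim has shape (num_unlabeled, num_query). The first term rewards covering
    every query (rare-class) exemplar; the second rewards each selected point
    for being relevant to some exemplar. This choice of function is an assumption.
    """
    if not selected:
        return 0.0
    s = sim[selected]  # shape (|A|, |Q|)
    return float(s.max(axis=0).sum() + eta * s.max(axis=1).sum())


def greedy_select(unlabeled, query, budget, eta=1.0):
    """Greedily pick `budget` unlabeled indices with the largest marginal SMI gain."""
    sim = cosine_similarity(unlabeled, query)
    selected = []
    for _ in range(min(budget, len(unlabeled))):
        base = smi_value(sim, selected, eta)
        best_gain, best_idx = -np.inf, None
        for i in range(len(unlabeled)):
            if i in selected:
                continue
            gain = smi_value(sim, selected + [i], eta) - base
            if gain > best_gain:
                best_gain, best_idx = gain, i
        selected.append(best_idx)
    return selected


if __name__ == "__main__":
    # Toy 2-D embeddings: rows 2 and 3 resemble the rare class, the rest do not.
    unlabeled = np.array([[0.0, 1.0], [0.1, 1.0], [1.0, 0.0], [1.0, 0.1], [0.0, 1.2]])
    # A few labeled rare-class exemplars act as the query set.
    query = np.array([[1.0, 0.05]])
    print(greedy_select(unlabeled, query, budget=2))
```

In an active learning loop, the selected indices would be sent for labeling, added to the training set, and the model retrained before the next round. The query set here stands in for a small pool of already-labeled rare-class examples.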