Data-Centric Label Smoothing for Explainable Glaucoma Screening from Eye Fundus Images
This work addresses the challenge of reliable and explainable glaucoma screening for medical diagnosis. Its contribution is incremental, as it builds on existing data-centric and label smoothing methods.
The paper tackles the problem of improving glaucoma screening from retinal images by addressing inter-rater variability in annotations through a data-centric label smoothing approach, resulting in performance gains over standard models and conventional label smoothing techniques in a multi-label, imbalanced context.
As computing capabilities increase, modern machine learning and computer vision systems tend to increase in complexity, mostly by means of larger models and advanced optimization strategies. Although often neglected, in many problems there is also much to be gained by better understanding and leveraging already-available training data, including annotations. This so-called data-centric approach can lead to substantial performance increases, sometimes beyond what can be achieved by larger models. In this paper, we adopt such an approach for the task of justifiable glaucoma screening from retinal images. In particular, we focus on how to combine information from multiple annotators with different skill levels into a tailored label smoothing scheme that allows us to better employ a large collection of fundus images, instead of discarding samples suffering from inter-rater variability. Internal validation results indicate that our bespoke label smoothing approach surpasses the performance of a standard ResNet50 model and also of the same model trained with conventional label smoothing techniques, in particular for the multi-label scenario of predicting clinical reasons for glaucoma likelihood in a highly imbalanced screening context. Our code is made available at github.com/agaldran/justraigs.
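To make the contrast concrete, the following minimal sketch compares conventional uniform label smoothing with a soft label built from several annotators' votes. It is an illustration of the general idea only, not the scheme used in this work, whose details are not given above. The function names, the per-annotator skill weights, and the weighted-vote rule are all assumptions introduced for this example.

```python
def uniform_label_smoothing(hard_label, eps=0.1, num_classes=2):
    """Conventional label smoothing: mix a one-hot target with a uniform
    distribution over classes. Returns one soft value per class."""
    return [
        (1.0 - eps) * (1.0 if k == hard_label else 0.0) + eps / num_classes
        for k in range(num_classes)
    ]


def annotator_soft_label(votes, weights=None):
    """Hypothetical annotator-aware soft label for one binary finding.

    votes   : list of 0/1 labels, one per annotator.
    weights : optional non-negative skill weights (assumed, not from the paper);
              defaults to equal weight for every annotator.
    Returns the weighted fraction of annotators who voted positive.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    if weights is None:
        weights = [1.0] * len(votes)
    if len(weights) != len(votes):
        raise ValueError("votes and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * v for w, v in zip(weights, votes)) / total


if __name__ == "__main__":
    # Two annotators say "glaucoma", one says "no glaucoma".
    votes = [1, 1, 0]
    print("uniform smoothing of hard label 1:", uniform_label_smoothing(1))
    print("equal-weight soft label:", annotator_soft_label(votes))
    # Assume the dissenting annotator is trusted twice as much as the others.
    print("skill-weighted soft label:", annotator_soft_label(votes, [1.0, 1.0, 2.0]))
```

Under these assumptions, a sample with disagreeing annotations is kept and given a target strictly between 0 and 1, rather than being discarded or forced to a hard label. In a multi-label setting, the same rule could be applied independently to each finding.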