CV, AI, LG · Aug 12, 2025

What Can We Learn from Inter-Annotator Variability in Skin Lesion Segmentation?

Kumar Abhishek, Jeremy Kawahara, Ghassan Hamarneh

arXiv:2508.09381v18.42 citations · h-index: 15 · Has Code · ISIC@MICCAI

Originality: Incremental advance

AI Analysis

This work addresses segmentation reliability in dermatological imaging, offering an incremental improvement by using inter-annotator variability as a predictive feature.

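The study analyzes inter-annotator variability in skin lesion segmentation, finds that inter-annotator agreement (IAA) is significantly associated with malignancy, and uses IAA as a "soft" clinical feature in a multi-task learning objective, yielding a 4.2% improvement in balanced accuracy averaged across multiple model architectures and five dermoscopic datasets (IMA++ and four public datasets).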

Medical image segmentation exhibits intra- and inter-annotator variability due to ambiguous object boundaries, annotator preferences, expertise, and tools, among other factors. Lesions with ambiguous boundaries, e.g., spiculated or infiltrative nodules, or irregular borders per the ABCD rule, are particularly prone to disagreement and are often associated with malignancy. In this work, we curate IMA++, the largest multi-annotator skin lesion segmentation dataset, on which we conduct an in-depth study of variability due to annotator, malignancy, tool, and skill factors. We find a statistically significant (p<0.001) association between inter-annotator agreement (IAA), measured using Dice, and the malignancy of skin lesions. We further show that IAA can be accurately predicted directly from dermoscopic images, achieving a mean absolute error of 0.108. Finally, we leverage this association by utilizing IAA as a "soft" clinical feature within a multi-task learning objective, yielding a 4.2% improvement in balanced accuracy averaged across multiple model architectures and across IMA++ and four public dermoscopic datasets. The code is available at https://github.com/sfu-mial/skin-IAV.
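To make the IAA measure concrete, the following is a minimal sketch of a Dice-based agreement score for one lesion annotated by several people. It is not taken from the authors' repository: the function names `dice` and `mean_pairwise_dice`, the choice to average over all annotator pairs, and the convention that two empty masks agree perfectly are assumptions made here for illustration. Masks are plain nested lists of 0/1 to keep the example dependency-free; in practice they would usually be NumPy arrays.

```python
from itertools import combinations


def dice(mask_a, mask_b):
    """Dice coefficient between two binary masks given as nested lists of 0/1."""
    flat_a = [int(bool(v)) for row in mask_a for v in row]
    flat_b = [int(bool(v)) for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must have the same shape")
    intersection = sum(a & b for a, b in zip(flat_a, flat_b))
    total = sum(flat_a) + sum(flat_b)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_pairwise_dice(masks):
    """Average Dice over all unordered pairs of annotator masks (assumed aggregation)."""
    pairs = list(combinations(masks, 2))
    if not pairs:
        raise ValueError("need at least two masks")
    return sum(dice(a, b) for a, b in pairs) / len(pairs)


# Three hypothetical annotations of the same 2x3 image.
annotator_1 = [[0, 1, 1], [0, 1, 0]]
annotator_2 = [[0, 1, 1], [0, 0, 0]]
annotator_3 = [[1, 1, 1], [0, 1, 0]]

iaa = mean_pairwise_dice([annotator_1, annotator_2, annotator_3])
print(f"IAA (mean pairwise Dice): {iaa:.3f}")
```

A score near 1 indicates that annotators drew nearly identical boundaries, while lower values indicate disagreement of the kind the abstract associates with ambiguous, irregular borders.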

