A Patient-Centric Dataset of Images and Metadata for Identifying Melanomas Using Clinical Context
The dataset aims to improve melanoma detection accuracy for dermatologists by providing patient-level contextual data. The contribution is incremental, since it builds on existing datasets.
The authors address the gap between AI skin lesion classification and clinical practice by creating a dataset with patient-level context: an identifier links multiple lesions from the same patient, mirroring the information dermatologists use for diagnosis. The dataset contains 33,126 dermoscopic images from 2,056 patients, with 584 histopathologically confirmed melanomas.
Prior skin image datasets have not addressed patient-level information obtained from multiple skin lesions from the same patient. Though artificial intelligence classification algorithms have achieved expert-level performance in controlled studies examining single images, in practice dermatologists base their judgment holistically on multiple lesions from the same patient. The 2020 SIIM-ISIC Melanoma Classification challenge dataset described herein was constructed to address this discrepancy between prior challenges and clinical practice, providing for each image in the dataset an identifier allowing lesions from the same patient to be mapped to one another. This patient-level contextual information is frequently used by clinicians to diagnose melanoma and is especially useful in ruling out false positives in patients with many atypical nevi. The dataset represents 2,056 patients from three continents with an average of 16 lesions per patient, consisting of 33,126 dermoscopic images and 584 histopathologically confirmed melanomas compared with benign melanoma mimickers.
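The patient identifier described above can be used to collect all lesions belonging to one patient before any per-patient reasoning takes place. The following is a minimal sketch of that grouping, not the authors' code: the field names (image_name, patient_id, target) and the five sample rows are assumptions made up for illustration. The last lines recompute the summary figures from the counts stated in the abstract.

```python
from collections import defaultdict

# Hypothetical metadata rows. Field names and values are assumptions for
# illustration only; target = 1 marks a confirmed melanoma, 0 a benign lesion.
rows = [
    {"image_name": "img_001", "patient_id": "P1", "target": 0},
    {"image_name": "img_002", "patient_id": "P1", "target": 0},
    {"image_name": "img_003", "patient_id": "P1", "target": 1},
    {"image_name": "img_004", "patient_id": "P2", "target": 0},
    {"image_name": "img_005", "patient_id": "P2", "target": 0},
]


def group_by_patient(rows):
    """Map each patient identifier to the list of that patient's lesion rows."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["patient_id"]].append(row)
    return dict(groups)


def patient_summary(groups):
    """Count lesions and melanomas per patient."""
    return {
        pid: {
            "n_lesions": len(lesions),
            "n_melanoma": sum(r["target"] for r in lesions),
        }
        for pid, lesions in groups.items()
    }


groups = group_by_patient(rows)
summary = patient_summary(groups)

# Counts reported in the abstract.
N_IMAGES, N_PATIENTS, N_MELANOMAS = 33_126, 2_056, 584

mean_lesions = N_IMAGES / N_PATIENTS        # about 16.1 lesions per patient
melanoma_fraction = N_MELANOMAS / N_IMAGES  # about 0.0176, i.e. roughly 1.8% of images

print(summary)
print(f"{mean_lesions:.1f} lesions per patient, {melanoma_fraction:.2%} melanoma")
```

The arithmetic agrees with the reported average of 16 lesions per patient and shows that confirmed melanomas make up under 2% of the images, so the classes are strongly imbalanced.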