IVSep 7, 2023
Evaluating Deep Learning-based Melanoma Classification using Immunohistochemistry and Routine Histology: A Three Center StudyChristoph Wies, Lucas Schneider, Sarah Haggenmueller et al.
Pathologists routinely use immunohistochemical (IHC)-stained tissue slides against MelanA in addition to hematoxylin and eosin (H&E)-stained slides to improve their accuracy in diagnosing melanomas. The use of diagnostic Deep Learning (DL)-based support systems for automated examination of tissue morphology and cellular composition has been well studied in standard H&E-stained tissue slides. In contrast, there are few studies that analyze IHC slides using DL. Therefore, we investigated the separate and joint performance of ResNets trained on MelanA and corresponding H&E-stained slides. The MelanA classifier achieved an area under receiver operating characteristics curve (AUROC) of 0.82 and 0.74 on out of distribution (OOD)-datasets, similar to the H&E-based benchmark classification of 0.81 and 0.75, respectively. A combined classifier using MelanA and H&E achieved AUROCs of 0.85 and 0.81 on the OOD datasets. DL MelanA-based assistance systems show the same performance as the benchmark H&E classification and may be improved by multi stain classification to assist pathologists in their clinical routine.
AISep 20, 2024
Dermatologist-like explainable AI enhances melanoma diagnosis accuracy: eye-tracking studyTirtha Chanda, Sarah Haggenmueller, Tabea-Clara Bucher et al.
Artificial intelligence (AI) systems have substantially improved dermatologists' diagnostic accuracy for melanoma, with explainable AI (XAI) systems further enhancing clinicians' confidence and trust in AI-driven decisions. Despite these advancements, there remains a critical need for objective evaluation of how dermatologists engage with both AI and XAI tools. In this study, 76 dermatologists participated in a reader study, diagnosing 16 dermoscopic images of melanomas and nevi using an XAI system that provides detailed, domain-specific explanations. Eye-tracking technology was employed to assess their interactions. Diagnostic performance was compared with that of a standard AI system lacking explanatory features. Our findings reveal that XAI systems improved balanced diagnostic accuracy by 2.8 percentage points relative to standard AI. Moreover, diagnostic disagreements with AI/XAI systems and complex lesions were associated with elevated cognitive load, as evidenced by increased ocular fixations. These insights have significant implications for clinical practice, the design of AI tools for visual tasks, and the broader development of XAI in medical diagnostics.