CV | Mar 2, 2023

Towards Trustable Skin Cancer Diagnosis via Rewriting Model's Decision

Siyuan Yan, Zhen Yu, Xuelin Zhang, Dwarikanath Mahapatra, Shekhar S. Chandra, Monika Janda, Peter Soyer, Zongyuan Ge

arXiv:2303.00885v118.138 citations | h-index: 55

Originality: Incremental advance

AI Analysis

This paper addresses trustworthiness issues in medical AI for skin cancer diagnosis, though the advance is incremental because it builds on existing bias-correction methods.

The paper tackles the problem of deep neural networks relying on confounding factors in skin cancer diagnosis, introducing a human-in-the-loop framework that automatically discovers and removes these factors, improving model performance and trustworthiness without prior knowledge or full concept labels.

Deep neural networks have demonstrated promising performance on image recognition tasks. However, they may rely heavily on confounding factors, using irrelevant artifacts or dataset bias as cues to improve performance. When a model makes decisions based on these spurious correlations, it can become untrustable and lead to catastrophic outcomes when deployed in real-world settings. In this paper, we explore and try to solve this problem in the context of skin cancer diagnosis. We introduce a human-in-the-loop framework into the model training process so that users can observe and correct the model's decision logic when confounding behaviors occur. Specifically, our method automatically discovers confounding factors by analyzing the co-occurrence behavior of the samples, and it learns confounding concepts from easily obtained concept exemplars. By mapping the black-box model's feature representation onto an explainable concept space, human users can interpret the concepts and intervene via first-order logic instructions. We systematically evaluate our method on our newly crafted, well-controlled skin lesion dataset and several public skin lesion datasets. Experiments show that our method can effectively detect and remove confounding factors from datasets without any prior knowledge of the category distribution and without requiring fully annotated concept labels. We also show that our method enables the model to focus on clinically relevant concepts, improving the model's performance and trustworthiness during inference.
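
To make the idea of an explainable concept space more concrete, the sketch below shows one simple way a confounding concept could be estimated from exemplars and then suppressed in a feature representation. It is an illustration under stated assumptions, not the authors' implementation: the difference-of-means concept direction, the projection-based removal (a stand-in for the paper's logic-based intervention), the synthetic features, and the function names `concept_direction`, `concept_score`, and `remove_concept` are all hypothetical.

```python
import numpy as np


def concept_direction(concept_feats, other_feats):
    """Unit vector pointing from non-concept features toward concept exemplars.

    Assumption: a difference of class means stands in for a learned concept vector.
    """
    diff = concept_feats.mean(axis=0) - other_feats.mean(axis=0)
    return diff / np.linalg.norm(diff)


def concept_score(feats, direction):
    """Project each feature vector onto the concept direction."""
    return feats @ direction


def remove_concept(feats, direction):
    """Subtract each feature vector's component along the concept direction."""
    return feats - np.outer(feats @ direction, direction)


# Synthetic stand-in for backbone features (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
dim = 16
artifact = np.zeros(dim)
artifact[0] = 1.0  # pretend axis 0 encodes an artifact such as a ruler mark

other = rng.normal(size=(200, dim))                           # images without the artifact
with_artifact = rng.normal(size=(200, dim)) + 3.0 * artifact  # concept exemplars

direction = concept_direction(with_artifact, other)
cleaned = remove_concept(with_artifact, direction)

print("mean score before:", concept_score(with_artifact, direction).mean())
print("mean score after: ", concept_score(cleaned, direction).mean())
```

In this toy setting the concept score of the exemplars drops to numerically zero after removal, which is the kind of effect a user-issued instruction to ignore a concept would aim for in the real framework.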
