Generation of Multimodal Justification Using Visual Word Constraint Model for Explainable Computer-Aided Diagnosis
This work addresses the need for interpretability in medical AI systems as a way to build confidence in their decisions, though it appears incremental because it applies existing explainable AI methods to a specific domain.
The authors tackle the problem of ambiguous decision-making in deep learning-based medical diagnosis by proposing a novel network that generates both a visual pointing map and a diagnostic sentence to justify each result. In breast mass diagnosis, the network produced more accurate explanations with various textual justifications.
The ambiguity of the decision-making process has been identified as the main obstacle to the practical application of deep learning-based methods, despite their outstanding performance. Interpretability can help establish confidence in a deep learning system, so it is particularly important in the medical field. In this study, a novel deep network is proposed that explains a diagnostic decision by simultaneously generating a visual pointing map and a diagnostic sentence that justify the result. To increase the accuracy of sentence generation, a visual word constraint model is devised for training the justification generator. To verify the proposed method, comparative experiments were conducted on the diagnosis of breast masses. The experimental results demonstrated that the proposed deep network could explain a diagnosis more accurately with various textual justifications.