IV CV LG ML Sep 12, 2019

Encoding Visual Attributes in Capsules for Explainable Medical Diagnoses

Rodney LaLonde, Drew Torigian, Ulas Bagci

arXiv:1909.05926v516.110 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for explainable AI in medical diagnosis, particularly for radiologists. It is an incremental advance because it builds on existing capsule network concepts.

The authors address the problem of uninterpretable 'black-box' neural networks in high-risk areas such as healthcare by developing X-Caps, a capsule network that encodes human-interpretable visual attributes for medical diagnoses. X-Caps achieves malignancy prediction performance approaching that of non-explainable 3D CNNs while outperforming a state-of-the-art 3D CNN at capturing interpretable attributes.

Convolutional neural network-based systems have largely failed to be adopted in many high-risk application areas, including healthcare, military, security, transportation, finance, and legal, due to their highly uninterpretable "black-box" nature. Towards solving this deficiency, we teach a novel multi-task capsule network to improve the explainability of predictions by embodying the same high-level language used by human experts. Our explainable capsule network, X-Caps, encodes high-level visual object attributes within the vectors of its capsules, then forms predictions based solely on these human-interpretable features. To encode attributes, X-Caps utilizes a new routing sigmoid function to independently route information from child capsules to parents. Further, to provide radiologists with an estimate of model confidence, we train our network on a distribution of expert labels, modeling inter-observer agreement and punishing over- and under-confidence during training, supervised by human experts' agreement. X-Caps simultaneously learns attribute and malignancy scores from a multi-center dataset of over 1000 CT scans of lung cancer screening patients. We demonstrate that a simple 2D capsule network can outperform a state-of-the-art deep dense dual-path 3D CNN at capturing visually-interpretable high-level attributes and malignancy prediction, while providing malignancy prediction scores approaching those of non-explainable 3D CNNs. To the best of our knowledge, this is the first study to investigate capsule networks for making predictions based on radiologist-level interpretable attributes and their application to medical image diagnosis. Code is publicly available at https://github.com/lalonderodney/X-Caps.
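The two mechanisms named in the abstract, independent routing via a sigmoid and training against a distribution of expert labels, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes that the routing sigmoid replaces the softmax over parent capsules with an element-wise sigmoid, that expert labels are integer ratings from 1 to 5, and that the mismatch with the expert distribution is measured by a KL divergence. All function names here are hypothetical.

```python
import math
from collections import Counter


def softmax_coupling(logits):
    """Softmax over one child's routing logits: parents compete, weights sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(b - m) for b in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid_coupling(logits):
    """Assumed 'routing sigmoid': each child-to-parent weight is set independently."""
    return [1.0 / (1.0 + math.exp(-b)) for b in logits]


def label_distribution(ratings, num_classes=5):
    """Turn several experts' integer ratings (1..num_classes) into a probability vector."""
    counts = Counter(ratings)
    n = len(ratings)
    return [counts.get(c, 0) / n for c in range(1, num_classes + 1)]


def kl_divergence(target, predicted, eps=1e-7):
    """KL(target || predicted); grows when the prediction ignores expert disagreement."""
    return sum(
        t * math.log(t / max(p, eps))
        for t, p in zip(target, predicted)
        if t > 0
    )


if __name__ == "__main__":
    logits = [2.0, 2.0, -1.0]  # hypothetical routing logits from one child capsule
    print("softmax:", softmax_coupling(logits))
    print("sigmoid:", sigmoid_coupling(logits))

    experts = [3, 4, 4, 5]  # hypothetical ratings from four radiologists
    target = label_distribution(experts)
    overconfident = [0.0, 0.0, 0.0, 1.0, 0.0]
    print("target distribution:", target)
    print("KL vs. overconfident prediction:", kl_divergence(target, overconfident))
```

With the softmax, strengthening one parent's weight necessarily weakens the others. With the sigmoid, a child can pass information strongly to several attribute capsules at once, which is one plausible reading of "independently route" in the abstract.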
