On Interpretability of Artificial Neural Networks: A Survey
This paper addresses the interpretability problem for researchers and practitioners in AI/ML, especially in high-stakes domains such as healthcare. Its contribution is incremental: it synthesizes existing work rather than introducing new methods.
This survey paper systematically reviews recent research on interpreting artificial neural networks, which are widely used but face acceptance barriers due to their black-box nature, particularly in critical applications like medical diagnosis. It organizes studies into a comprehensive taxonomy, describes applications in medicine, and discusses future directions including connections to fuzzy logic and brain science.
Deep learning, as represented by artificial deep neural networks (DNNs), has achieved great success in many important areas that deal with text, images, videos, graphs, and so on. However, the black-box nature of DNNs has become one of the primary obstacles to their wide acceptance in mission-critical applications such as medical diagnosis and therapy. Because of the huge potential of deep learning, interpreting neural networks has recently attracted much research attention. In this paper, based on our comprehensive taxonomy, we systematically review recent studies on understanding the mechanisms of neural networks, describe applications of interpretability, especially in medicine, and discuss future directions of interpretability research, such as its relation to fuzzy logic and brain science.