Semantic interpretation for convolutional neural networks: What makes a cat a cat?
This work addresses the interpretability of deep learning models for researchers and practitioners. Its contribution appears incremental, since it builds on existing methods such as principal component analysis and genetic algorithms.
The paper addresses the interpretation of convolutional neural networks (CNNs) by introducing semantic explainable AI (S-XAI), a framework that extracts understandable semantic spaces. It demonstrates the framework's effectiveness in applications such as trustworthiness assessment and semantic sample searching.
The interpretability of deep neural networks has attracted increasing attention in recent years, and several methods have been developed to interpret these "black box" models. Fundamental limitations remain, however, that slow progress in understanding the networks, particularly in extracting an understandable semantic space. In this work, we introduce the framework of semantic explainable AI (S-XAI), which uses row-centered principal component analysis to obtain the common traits from the best combination of superpixels discovered by a genetic algorithm, and extracts understandable semantic spaces on the basis of discovered semantically sensitive neurons and visualization techniques. A statistical interpretation of the semantic space is also provided, and the concept of semantic probability is proposed for the first time. Our experimental results demonstrate that S-XAI is effective in providing a semantic interpretation for convolutional neural networks (CNNs) and has broad applications, including trustworthiness assessment and semantic sample searching.
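The row-centered principal component analysis step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `row_centered_pca`, the array shapes, and the toy data are hypothetical, and the sketch assumes that "row-centered" means subtracting each sample's own mean across its features before the decomposition. The superpixel selection by the genetic algorithm and the CNN feature extraction are left out; the input is taken to be one feature vector per selected sample.

```python
import numpy as np


def row_centered_pca(features, n_components=1):
    """Sketch of PCA with row centering (assumed interpretation).

    features: array of shape (n_samples, n_features), one row per sample.
    Returns the leading principal directions, shape (n_components, n_features),
    and the fraction of total variance each one explains.
    """
    X = np.asarray(features, dtype=float)
    # Row centering: remove each sample's own mean over its features,
    # rather than the per-feature mean used in standard PCA.
    X_centered = X - X.mean(axis=1, keepdims=True)
    # The right singular vectors are the principal directions in feature space.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return Vt[:n_components], explained[:n_components]


# Toy usage: 20 hypothetical samples sharing one pattern with varying strength.
rng = np.random.default_rng(0)
pattern = np.sin(np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False))
strengths = rng.uniform(1.0, 2.0, size=20)
samples = np.outer(strengths, pattern) + 0.05 * rng.normal(size=(20, 64))

components, explained = row_centered_pca(samples, n_components=1)
common_trait = components[0]  # direction shared by the samples
```

In this toy setting the first principal direction recovers the shared pattern up to sign and scale, which is the sense in which a leading component can stand in for a "common trait" across samples.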