Formal Conceptual Views in Neural Networks
This work addresses the challenge of globally explaining neural networks for AI analysts, though it appears incremental as it builds on existing explanation methods.
The paper tackles the problem of explaining neural network models by introducing two conceptual views (many-valued and symbolic) for analyzing the knowledge captured by a network's neurons. The views are tested on the ImageNet and Fruit-360 data sets and are used to quantify the conceptual similarity of different learning architectures and to abductively learn human-comprehensible rules from neurons.
Explaining neural network models is a challenging task that has not been fully solved to this day. This is especially true for high-dimensional and complex data. With the present work, we introduce two notions of conceptual views of a neural network, specifically a many-valued and a symbolic view. Both provide novel analysis methods that enable a human AI analyst to gain deeper insights into the knowledge captured by the neurons of a network. We test the conceptual expressivity of our novel views through different experiments on the ImageNet and Fruit-360 data sets. Furthermore, we show to what extent the views allow quantifying the conceptual similarity of different learning architectures. Finally, we demonstrate how conceptual views can be applied for abductive learning of human-comprehensible rules from neurons. In summary, our work contributes to the highly relevant task of globally explaining neural network models.