C-SENN: Contrastive Self-Explaining Neural Network
This work addresses the need for more interpretable AI in real-world applications where explanations are required, but the contribution appears incremental because it builds on existing SENN methods.
The paper tackles the reduced interpretability of self-explaining neural networks (SENN) in general settings such as autonomous driving. It combines contrastive learning with concept learning, which is reported to improve both the readability of the learned concepts and task accuracy.
In this study, we use a self-explaining neural network (SENN), which learns concepts without supervision, to automatically acquire concepts that are easy for people to understand. In concept learning, the hidden layer retains verbalizable features relevant to the output, which is crucial when adapting to real-world environments where explanations are required. However, the interpretability of the concepts output by SENN is known to degrade in general settings, such as autonomous driving scenarios. This study therefore combines contrastive learning with concept learning to improve both the readability of the concepts and task accuracy. We call this model the Contrastive Self-Explaining Neural Network (C-SENN).
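
To make the idea of combining concept learning with contrastive learning concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes the usual SENN-style aggregation, in which each class score is a relevance-weighted sum of concept activations, and it assumes an InfoNCE-style contrastive term applied to the concept vectors of two augmented views of the same input. The abstract does not name the contrastive objective, so the choice of InfoNCE, the function names (`senn_output`, `info_nce`, `total_loss`), the temperature, and the loss weight are all illustrative assumptions.

```python
import math


def senn_output(concepts, relevances):
    """SENN-style aggregation (assumed form): score_c = sum_i relevance[c][i] * concept[i]."""
    return [sum(t * h for t, h in zip(theta_c, concepts)) for theta_c in relevances]


def cosine(u, v):
    """Cosine similarity between two concept vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def info_nce(view_a, view_b, temperature=0.5):
    """InfoNCE-style loss (an assumption, not taken from the paper).

    view_a[i] and view_b[i] are concept vectors from two augmentations of
    the same input; every other pairing in the batch acts as a negative.
    """
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / len(view_a)


def total_loss(task_loss, contrastive_loss, weight=0.1):
    """Hypothetical combined objective: task loss plus a weighted contrastive term."""
    return task_loss + weight * contrastive_loss


if __name__ == "__main__":
    concepts = [1.0, 2.0]                     # two concept activations
    relevances = [[0.5, 0.5], [1.0, -1.0]]    # per-class relevance scores
    print(senn_output(concepts, relevances))

    batch_a = [[1.0, 0.0], [0.0, 1.0]]
    batch_b = [[1.0, 0.0], [0.0, 1.0]]
    print(total_loss(task_loss=0.7, contrastive_loss=info_nce(batch_a, batch_b)))
```

In this sketch the contrastive term is small when the two views of each input map to similar concept vectors and large when they are confused with other inputs in the batch, which is one plausible way such a term could encourage more consistent, readable concepts.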