Semantic Explanations of Predictions
This work addresses the need for more interpretable AI systems for users who must understand model decisions. The contribution appears incremental: it extends existing explanation methods with semantic concepts and contrastive evidence.
The paper tackles the problem of generating informative explanations for machine learning predictions by selecting training data points with special characteristics, such as contrastive and representative ones, and deriving semantic concepts from them using domain ontologies to improve human understanding.
The main objective of explanations is to transmit knowledge to humans. This work proposes to construct informative explanations for predictions made by machine learning models. Motivated by observations from the social sciences, our approach selects data points from the training sample that exhibit special characteristics crucial for explanation, for instance, ones contrastive to the classification prediction and ones representative of the models. Subsequently, semantic concepts are derived from the selected data points through the use of domain ontologies. These concepts are filtered and ranked to produce informative explanations that improve human understanding. The main features of our approach are that (1) knowledge about explanations is captured in the form of ontological concepts, (2) explanations include contrastive evidence in addition to normal evidence, and (3) explanations are user-relevant.
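The pipeline described above (select evidence, map it to concepts, then filter and rank) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes nearest-neighbour selection as a stand-in for the paper's selection criteria, a plain dictionary as a stand-in for a domain ontology, and a simple frequency difference as the ranking score. The names `TRAIN`, `ONTOLOGY`, `select_evidence`, `rank_concepts`, and `explain`, along with all data values, are hypothetical.

```python
from collections import Counter
from math import dist

# Toy training sample: (instance id, feature vector, label). Values are made up.
TRAIN = [
    ("a", (0.0, 0.0), "cat"),
    ("b", (0.1, 0.2), "cat"),
    ("c", (1.0, 1.0), "dog"),
    ("d", (0.9, 1.1), "dog"),
    ("e", (0.3, 0.1), "cat"),
]

# Hypothetical stand-in for a domain ontology: instance id -> concepts.
ONTOLOGY = {
    "a": {"Feline", "Pet"},
    "b": {"Feline", "Pet"},
    "c": {"Canine", "Pet"},
    "d": {"Canine", "Pet"},
    "e": {"Feline"},
}


def select_evidence(query, predicted, k=2):
    """Return ids of the k nearest same-label and k nearest other-label points.

    Assumption: Euclidean nearest neighbours approximate "representative"
    (same label) and "contrastive" (different label) training points.
    """
    ranked = sorted(TRAIN, key=lambda row: dist(query, row[1]))
    supporting = [i for i, _, y in ranked if y == predicted][:k]
    contrastive = [i for i, _, y in ranked if y != predicted][:k]
    return supporting, contrastive


def rank_concepts(ids, other_ids):
    """Rank concepts that occur more often in ids than in other_ids.

    Assumption: the score is a plain count difference; concepts with a
    non-positive score are filtered out as uninformative.
    """
    counts = Counter(c for i in ids for c in ONTOLOGY[i])
    other = Counter(c for i in other_ids for c in ONTOLOGY[i])
    scores = {c: counts[c] - other[c] for c in counts}
    kept = [c for c, s in scores.items() if s > 0]
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(kept, key=lambda c: (-scores[c], c))


def explain(query, predicted, k=2):
    """Build an explanation with supporting and contrastive concepts."""
    supporting, contrastive = select_evidence(query, predicted, k)
    return {
        "supporting": rank_concepts(supporting, contrastive),
        "contrastive": rank_concepts(contrastive, supporting),
    }
```

Calling `explain((0.05, 0.1), "cat")` on this toy data keeps `Feline` as supporting evidence and `Canine` as contrastive evidence, while `Pet` is filtered out because it occurs equally often on both sides and so does not distinguish the prediction from its alternative.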