Do intermediate feature coalitions aid explainability of black-box models?
This work addresses the challenge of interpretability for users of complex AI models, though it appears incremental as it builds on existing coalitional game theory concepts for explainability.
The paper tackles the problem of explaining black-box models by introducing intermediate concepts based on a hierarchical levels structure. This allows explanations to be generated at varying levels of abstraction, as demonstrated on a real-world car model example and the Titanic dataset.
This work introduces intermediate concepts, based on a levels structure, to aid explainability for black-box models. A levels structure is a hierarchy in which each level is a partition of the dataset's features (i.e., a partition of the player set into coalitions). Coarseness increases from the trivial partition, which comprises only singletons, to the partition that contains only the grand coalition. In addition, a domain expert can establish meronomies, i.e., part-whole relationships, which can be used to generate explanations at an abstract level. We illustrate the usability of this approach on a real-world car model example and the Titanic dataset, where intermediate concepts aid explainability at different levels of abstraction.
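
To make the definition concrete, the following is a minimal sketch of a levels structure as a list of nested feature partitions, with a check that it runs from singletons to the grand coalition. It is an illustration only, not the paper's implementation: the helper names (`is_partition`, `is_coarsening`, `is_levels_structure`) and the feature grouping are assumptions chosen for this example.

```python
from typing import FrozenSet, List, Sequence

Partition = List[FrozenSet[str]]


def is_partition(blocks: Partition, players: Sequence[str]) -> bool:
    """True if the blocks are non-empty, disjoint, and cover all players."""
    seen = set()
    for block in blocks:
        if not block or seen & block:
            return False
        seen |= block
    return seen == set(players)


def is_coarsening(fine: Partition, coarse: Partition) -> bool:
    """True if every block of `fine` lies inside some block of `coarse`."""
    return all(any(f <= c for c in coarse) for f in fine)


def is_levels_structure(levels: List[Partition], players: Sequence[str]) -> bool:
    """Check: singletons at the bottom, grand coalition at the top, nested in between."""
    if not levels or not all(is_partition(lv, players) for lv in levels):
        return False
    starts_trivial = all(len(block) == 1 for block in levels[0])
    ends_grand = len(levels[-1]) == 1
    nested = all(
        is_coarsening(levels[i], levels[i + 1]) for i in range(len(levels) - 1)
    )
    return starts_trivial and ends_grand and nested


# Hypothetical Titanic-style features; the middle-level grouping is an
# assumed expert-defined meronomy, not one taken from the paper.
players = ["age", "sex", "pclass", "fare"]
levels = [
    [frozenset({p}) for p in players],                          # trivial partition
    [frozenset({"age", "sex"}), frozenset({"pclass", "fare"})], # intermediate concepts
    [frozenset(players)],                                       # grand coalition
]
valid = is_levels_structure(levels, players)

# A middle level that splits nothing coherently: "age" is missing, so it is
# not a partition of the player set.
broken = [levels[0], [frozenset({"sex"}), frozenset({"pclass", "fare"})], levels[2]]
invalid = is_levels_structure(broken, players)
```

Here the middle level plays the role of the intermediate concepts: an explanation method could attribute importance to the two blocks as wholes before drilling down to individual features.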