Axiomatic Foundations of Counterfactual Explanations

arXiv:2602.04028v14.4

Originality: Incremental advance

AI Analysis

This work addresses the problem of improving trust in AI systems by providing a foundational taxonomy for explainers, though it is incremental in formalizing existing concepts.

The paper tackles the lack of systematic study of counterfactual explanation types by introducing an axiomatic framework that proves impossibility theorems and identifies five distinct types of counterfactuals, including both local and global explanations.

Explaining autonomous and intelligent systems is critical to improving trust in their decisions. Counterfactuals have emerged as one of the most compelling forms of explanation. They address "why not" questions by revealing how decisions could be altered. Despite the growing literature, most existing explainers focus on a single type of counterfactual and are restricted to local explanations, which concern individual instances. There has been no systematic study of alternative counterfactual types, nor of global counterfactuals that shed light on a system's overall reasoning process. This paper addresses both gaps by introducing an axiomatic framework built on a set of desirable properties for counterfactual explainers. It proves impossibility theorems showing that no single explainer can satisfy certain axiom combinations simultaneously, and fully characterizes all compatible sets. Representation theorems then establish five one-to-one correspondences between specific subsets of axioms and the families of explainers that satisfy them. Each family gives rise to a distinct type of counterfactual explanation, uncovering five fundamentally different types of counterfactuals. Some of these correspond to local explanations, while others capture global explanations. Finally, the framework situates existing explainers within this taxonomy, formally characterizes their behavior, and analyzes the computational complexity of generating such explanations.
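
To make the idea of a local counterfactual concrete, here is a minimal sketch, not taken from the paper. It assumes a hypothetical toy classifier over three binary features and uses one common notion of counterfactual: the smallest sets of feature changes that flip the decision for a single instance. The names `classify` and `local_counterfactuals` are illustrative, and this notion is not claimed to match any of the five types the paper identifies.

```python
from itertools import combinations, product


def classify(x):
    # Hypothetical toy classifier (an assumption for illustration):
    # returns 1 iff at least two of the binary features are set.
    return 1 if sum(x) >= 2 else 0


def local_counterfactuals(x, classifier, domain=(0, 1)):
    """Return every smallest set of feature changes that flips the decision on x.

    Each result is a dict mapping a feature index to its new value.
    Brute force: tries change sets of size 1, then 2, and so on.
    """
    original = classifier(x)
    n = len(x)
    for size in range(1, n + 1):
        found = []
        for idxs in combinations(range(n), size):
            # Alternative values for each selected feature.
            choices = [[v for v in domain if v != x[i]] for i in idxs]
            for values in product(*choices):
                y = list(x)
                for i, v in zip(idxs, values):
                    y[i] = v
                if classifier(tuple(y)) != original:
                    found.append(dict(zip(idxs, values)))
        if found:
            # Stop at the first size that yields any flip.
            return found
    return []


if __name__ == "__main__":
    # "Why was (1, 0, 0) not accepted?" Setting feature 1 or feature 2 suffices.
    print(local_counterfactuals((1, 0, 0), classify))
```

The search is exponential in the number of features, which is consistent with the abstract's point that the cost of generating such explanations deserves its own analysis.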
