Counterfactual Explanations for Oblique Decision Trees: Exact, Efficient Algorithms
This work addresses the need for interpretable and actionable explanations in critical domains such as finance and medicine, though its contribution is incremental: it extends existing counterfactual methods to a specific classifier type.
The paper tackles the problem of generating counterfactual explanations for oblique decision trees, that is, minimally adjusting input features so that the classifier's decision changes. It demonstrates that an exact solution can be computed efficiently even with high-dimensional inputs and a mix of continuous and categorical features.
We consider counterfactual explanations, the problem of minimally adjusting features in a source input instance so that it is classified as a target class under a given classifier. This has become a topic of recent interest as a way to query a trained model and suggest possible actions to overturn its decision. Mathematically, the problem is formally equivalent to that of finding adversarial examples, which has also attracted significant attention recently. Most work on either counterfactual explanations or adversarial examples has focused on differentiable classifiers, such as neural nets. We focus on classification trees, both axis-aligned and oblique (having hyperplane splits). Although the counterfactual optimization problem is nonconvex and nondifferentiable in this setting, we show that an exact solution can be computed very efficiently, even with high-dimensional feature vectors and with both continuous and categorical features, and we demonstrate this on different datasets and settings. The results are particularly relevant for finance, medicine or legal applications, where interpretability and counterfactual explanations matter most.
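To make the exactness claim concrete, here is a minimal sketch of one standard way to solve the problem exactly for a tree. It is an illustration under stated assumptions, not necessarily the algorithm used in the paper. The idea is that each leaf corresponds to a convex region of input space, so the nonconvex problem splits into one convex problem per target-class leaf, and the best of those solutions is the global optimum. The sketch covers only the axis-aligned case with continuous features and Euclidean distance, where each leaf region is a box and the closest point is obtained by clipping. For oblique splits each leaf region is a polytope and the per-leaf problem becomes a convex quadratic program, which this sketch does not implement. The dictionary tree encoding, the names `leaf_boxes`, `predict` and `counterfactual`, and the margin `EPS` (used to satisfy the strict inequality on right branches) are all assumptions made for illustration.

```python
import numpy as np

# Assumed margin so that a point placed on a right branch satisfies x[j] > t strictly.
EPS = 1e-6

def predict(node, x):
    """Route x down the tree: go left if x[feature] <= threshold, else right."""
    while "label" not in node:
        branch = "left" if x[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["label"]

def leaf_boxes(node, lo, hi):
    """Yield (label, lo, hi) for every leaf, where [lo, hi] is the leaf's box."""
    if "label" in node:
        yield node["label"], lo, hi
        return
    j, t = node["feature"], node["threshold"]
    # Left child: x[j] <= t tightens the upper bound.
    left_hi = hi.copy()
    left_hi[j] = min(hi[j], t)
    yield from leaf_boxes(node["left"], lo, left_hi)
    # Right child: x[j] > t, approximated by x[j] >= t + EPS.
    right_lo = lo.copy()
    right_lo[j] = max(lo[j], t + EPS)
    yield from leaf_boxes(node["right"], right_lo, hi)

def counterfactual(tree, x, target):
    """Return the closest point (Euclidean) to x that the tree labels as target."""
    x = np.asarray(x, dtype=float)
    d = x.size
    best, best_dist = None, np.inf
    for label, lo, hi in leaf_boxes(tree, np.full(d, -np.inf), np.full(d, np.inf)):
        if label != target or np.any(lo > hi):
            continue  # wrong class, or an empty (infeasible) leaf region
        cand = np.clip(x, lo, hi)  # projection of x onto the box
        dist = np.linalg.norm(cand - x)
        if dist < best_dist:
            best, best_dist = cand, dist
    return best, best_dist

# Hypothetical two-feature tree: class 1 only when x[0] > 0.5 and x[1] > 2.0.
tree = {
    "feature": 0, "threshold": 0.5,
    "left": {"label": 0},
    "right": {
        "feature": 1, "threshold": 2.0,
        "left": {"label": 0},
        "right": {"label": 1},
    },
}

source = np.array([0.2, 1.0])
cf, dist = counterfactual(tree, source, target=1)
print(predict(tree, source), predict(tree, cf), cf, dist)
```

The cost of this sketch is linear in the number of leaves times the number of features, which suggests why leaf enumeration can scale to high-dimensional inputs. Handling categorical features would need an extra per-leaf step that is left out here.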