LGMLMar 6, 2023

Very fast, approximate counterfactual explanations for decision forests

arXiv:2303.02883v18 citationsh-index: 7
Originality Incremental advance
AI Analysis

This provides a practical solution for users needing fast and realistic counterfactual explanations in machine learning applications, though it is incremental as it builds on existing optimization methods.

The paper tackles the problem of finding counterfactual explanations for decision forests efficiently by constraining the search to data-populated regions, reducing it to a nearest-neighbor problem, which results in very fast scaling to large forests and high-dimensional data while producing realistic solutions.

We consider finding a counterfactual explanation for a classification or regression forest, such as a random forest. This requires solving an optimization problem to find the closest input instance to a given instance for which the forest outputs a desired value. Finding an exact solution has a cost that is exponential on the number of leaves in the forest. We propose a simple but very effective approach: we constrain the optimization to only those input space regions defined by the forest that are populated by actual data points. The problem reduces to a form of nearest-neighbor search using a certain distance on a certain dataset. This has two advantages: first, the solution can be found very quickly, scaling to large forests and high-dimensional data, and enabling interactive use. Second, the solution found is more likely to be realistic in that it is guided towards high-density areas of input space.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes