ML · LG · Apr 23, 2020

Multi-Objective Counterfactual Explanations

arXiv:2004.11165v2 · 318 citations
Originality: Incremental advance
AI Analysis

This work addresses interpretability in machine learning for users who need detailed explanations. Its contribution is incremental: it builds on existing counterfactual methods by recasting the counterfactual search as a multi-objective optimization problem.

The paper addresses the difficulty of balancing multiple objectives when generating counterfactual explanations for black-box ML models. It proposes the Multi-Objective Counterfactuals (MOC) method, which returns a diverse set of counterfactuals with different trade-offs among the objectives while maintaining diversity in feature space. This supports more detailed post-hoc analysis and gives users more options for actionable responses.

Counterfactual explanations are one of the most popular methods to make predictions of black box machine learning models interpretable by providing explanations in the form of 'what-if scenarios'. Most current approaches optimize a collapsed, weighted sum of multiple objectives, which are naturally difficult to balance a priori. We propose the Multi-Objective Counterfactuals (MOC) method, which translates the counterfactual search into a multi-objective optimization problem. Our approach not only returns a diverse set of counterfactuals with different trade-offs between the proposed objectives, but also maintains diversity in feature space. This enables a more detailed post-hoc analysis to facilitate better understanding and also more options for actionable user responses to change the predicted outcome. Our approach is also model-agnostic and works for numerical and categorical input features. We show the usefulness of MOC in concrete cases and compare our approach with state-of-the-art methods for counterfactual explanations.
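
The contrast the abstract draws between a collapsed weighted sum and a multi-objective formulation can be illustrated with a minimal sketch. It is not the MOC search procedure: it only filters a fixed list of candidates down to the non-dominated (Pareto-optimal) ones and compares that set with the single winner of a weighted sum. The three objectives, the candidate scores, and the weights are hypothetical values chosen for illustration; the abstract does not specify them.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scores: Sequence[Sequence[float]]) -> List[int]:
    """Indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


def weighted_sum_best(scores: Sequence[Sequence[float]],
                      weights: Sequence[float]) -> int:
    """Index of the single candidate minimizing a fixed weighted sum."""
    return min(
        range(len(scores)),
        key=lambda i: sum(w * v for w, v in zip(weights, scores[i])),
    )


# Hypothetical objective values per candidate counterfactual (lower is better):
# (gap to the desired prediction, distance to the original input,
#  number of changed features). These numbers are made up.
candidates = [
    (0.00, 0.90, 3),  # reaches the target exactly, but far from the input
    (0.10, 0.40, 2),  # a middle trade-off
    (0.30, 0.10, 1),  # very close to the input, weaker effect on prediction
    (0.30, 0.50, 2),  # dominated by the second candidate
]

front = pareto_front(candidates)                       # several trade-offs
best = weighted_sum_best(candidates, (1.0, 1.0, 0.1))  # one answer only

print("Pareto-optimal candidates:", front)
print("Weighted-sum winner:", best)
```

The weighted sum commits to one trade-off before any candidate has been seen, so a different choice of the (assumed) weights would return a different single answer. The non-dominated set keeps every defensible trade-off and leaves the choice to the user.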

Code Implementations: 1 repo