ML LG, Apr 23, 2020

Multi-Objective Counterfactual Explanations

Susanne Dandl, Christoph Molnar, Martin Binder, Bernd Bischl

arXiv:2004.11165v2, 318 citations

Originality: Incremental advance

AI Analysis

This work addresses interpretability for users of ML models who need detailed explanations. The advance is incremental: it builds on existing counterfactual methods by recasting the counterfactual search as a multi-objective optimization problem.

The paper tackles the challenge of balancing multiple objectives in counterfactual explanations for black-box ML models. It proposes the Multi-Objective Counterfactuals (MOC) method, which generates a set of counterfactuals that offer different trade-offs between the objectives and remain diverse in feature space. This enables more detailed post-hoc analysis and gives users more options for actionable responses.

Counterfactual explanations are one of the most popular methods to make predictions of black box machine learning models interpretable by providing explanations in the form of 'what-if scenarios'. Most current approaches optimize a collapsed, weighted sum of multiple objectives, which are naturally difficult to balance a priori. We propose the Multi-Objective Counterfactuals (MOC) method, which translates the counterfactual search into a multi-objective optimization problem. Our approach not only returns a diverse set of counterfactuals with different trade-offs between the proposed objectives, but also maintains diversity in feature space. This enables a more detailed post-hoc analysis to facilitate better understanding and also more options for actionable user responses to change the predicted outcome. Our approach is also model-agnostic and works for numerical and categorical input features. We show the usefulness of MOC in concrete cases and compare our approach with state-of-the-art methods for counterfactual explanations.
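
The contrast the abstract draws, between collapsing objectives into one weighted sum and returning a set of trade-offs, can be illustrated with a small Python sketch. It is not the authors' implementation: the model `predict`, the three objectives, and the random sampler are all hypothetical stand-ins, and plain random sampling followed by a non-dominated filter replaces whatever search procedure MOC actually uses. The point is only to show what "a set of counterfactuals with different trade-offs" could look like.

```python
import random

def predict(x):
    # Hypothetical black-box model (assumption): a linear score clipped to [0, 1].
    score = 0.3 * x[0] + 0.5 * x[1] - 0.2 * x[2]
    return min(1.0, max(0.0, score))

def objectives(candidate, original, target):
    # Three illustrative objectives (assumptions), all to be minimized.
    pred_gap = abs(predict(candidate) - target)                      # miss on desired outcome
    distance = sum(abs(c - o) for c, o in zip(candidate, original))  # L1 distance to original
    n_changed = sum(1 for c, o in zip(candidate, original) if c != o)  # features altered
    return (pred_gap, distance, n_changed)

def dominates(a, b):
    # a dominates b if it is no worse on every objective and strictly better on one.
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(scored):
    # Keep every (candidate, scores) pair that no other pair dominates.
    return [(c, s) for c, s in scored
            if not any(dominates(t, s) for _, t in scored)]

def sample_candidates(original, n, rng):
    # Naive stand-in for a real search: perturb a random subset of features.
    out = []
    for _ in range(n):
        cand = list(original)
        for i in range(len(cand)):
            if rng.random() < 0.5:
                cand[i] = round(cand[i] + rng.uniform(-1.0, 1.0), 2)
        out.append(tuple(cand))
    return out

rng = random.Random(0)
original = (0.5, 0.4, 0.6)   # instance to explain (made-up values)
target = 0.8                 # desired prediction (made-up value)

candidates = sample_candidates(original, 200, rng)
scored = [(c, objectives(c, original, target)) for c in candidates]
front = pareto_front(scored)

for cand, (gap, dist, changed) in sorted(front, key=lambda p: p[1]):
    print(cand, round(gap, 3), round(dist, 3), changed)
```

Each printed row is one non-dominated candidate: some reach the target prediction at the cost of larger or more numerous changes, others change little but fall short. A weighted-sum approach would return only one of these rows, chosen by weights fixed in advance.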
