NELG · Feb 3, 2025

A Novel Multi-Objective Evolutionary Algorithm for Counterfactual Generation

arXiv:2502.10418v1
h-index: 2
Originality: Incremental advance
AI Analysis

This work addresses the need for interpretable AI in high-stakes decisions such as loan approvals. It is an incremental advance that builds on existing evolutionary approaches.

The paper tackles the problem of generating counterfactual explanations for black-box machine learning models. It proposes a multi-objective evolutionary algorithm based on lexicographic optimisation, and it extends the validity objective to measure how resilient a counterfactual is to violations of monotonicity constraints. Experiments across 15 settings showed that the algorithm is competitive with an existing Pareto dominance-based method and that the extended validity objective substantially increases counterfactual validity.
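To make the contrast between lexicographic optimisation and Pareto dominance concrete, here is a minimal sketch of the two comparison rules for minimised objective vectors. The objective order (invalidity, number of changed features, distance) and the tolerance values are hypothetical, chosen only for illustration; the summary above does not state which objectives, priorities, or tolerances the paper uses.

```python
from typing import Sequence


def lexicographic_better(a: Sequence[float], b: Sequence[float],
                         tolerances: Sequence[float]) -> bool:
    """Return True if `a` beats `b` lexicographically (all objectives minimised).

    Objectives are listed in decreasing priority. A difference no larger than
    the tolerance counts as a tie, and the next objective decides.
    """
    for fa, fb, tol in zip(a, b, tolerances):
        if abs(fa - fb) > tol:
            return fa < fb
    return False  # tied on every objective


def pareto_dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is no worse than `b` everywhere and better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


# Hypothetical objective vectors: (invalidity, changed features, distance).
cf_a = (0.0, 3.0, 0.2)
cf_b = (0.0, 2.0, 0.9)
tolerances = (0.0, 0.0, 0.01)  # assumed values, not taken from the paper

b_beats_a = lexicographic_better(cf_b, cf_a, tolerances)
a_dominates_b = pareto_dominates(cf_a, cf_b)
b_dominates_a = pareto_dominates(cf_b, cf_a)
```

In this toy case neither candidate Pareto-dominates the other, so a Pareto-based method would keep both, while the lexicographic rule prefers `cf_b` because it wins on the higher-priority objective.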

Machine learning algorithms that learn black-box predictive models (which cannot be directly interpreted) are increasingly used to make predictions affecting people's lives. It is important that users understand the predictions of such models, particularly when the model outputs a negative prediction for the user (e.g. denying a loan). Counterfactual explanations provide users with guidance on how to change some of their characteristics to receive a different, positive classification from a predictive model. For example, if a predictive model rejected a loan application from a user, a counterfactual explanation might state: "If your salary were £50,000 (rather than your current £35,000), then your loan would be approved." This paper proposes two novel contributions: (a) a multi-objective Evolutionary Algorithm (EA) for counterfactual generation based on lexicographic optimisation, rather than the more popular Pareto dominance approach; and (b) an extension to the definition of the validity objective for a counterfactual, based on measuring the resilience of a counterfactual to violations of monotonicity constraints that users intuitively expect; e.g., the probability of a loan application being approved would be expected to increase monotonically with the applicant's salary. Experiments involving 15 experimental settings (3 types of black-box models × 5 datasets) have shown that the proposed lexicographic optimisation-based EA is very competitive with an existing Pareto dominance-based EA, and that the proposed extension of the validity objective has led to a substantial increase in the validity of the counterfactuals generated by the proposed EA.
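One way to picture the monotonicity-based validity idea is the sketch below. It assumes a simple scoring rule that is not taken from the paper: perturb each feature that users expect to help (such as salary) further in the helpful direction, and report the fraction of perturbations for which the counterfactual keeps its positive class. The function name `monotonicity_resilience`, the toy models, and the step sizes are all hypothetical.

```python
from typing import Callable, Dict, Sequence

Instance = Dict[str, float]


def monotonicity_resilience(predict: Callable[[Instance], int],
                            counterfactual: Instance,
                            increasing_features: Sequence[str],
                            steps: Sequence[float]) -> float:
    """Fraction of monotone perturbations that keep the positive class (1).

    Assumed scoring rule for illustration only: each listed feature is
    increased by each step, and a prediction that flips back to 0 counts
    as a violation of the expected monotonicity.
    """
    kept = 0
    total = 0
    for name in increasing_features:
        for step in steps:
            perturbed = dict(counterfactual)  # copy, so the input is unchanged
            perturbed[name] += step
            total += 1
            if predict(perturbed) == 1:
                kept += 1
    return kept / total if total else 1.0


def non_monotonic_model(x: Instance) -> int:
    """Toy black box: approves only salaries between 45,000 and 60,000."""
    return 1 if 45_000 <= x["salary"] <= 60_000 else 0


def monotonic_model(x: Instance) -> int:
    """Toy black box: approves any salary of at least 45,000."""
    return 1 if x["salary"] >= 45_000 else 0


cf = {"salary": 50_000.0}   # the counterfactual from the loan example
steps = (5_000.0, 20_000.0)  # hypothetical salary increases

score_non_monotonic = monotonicity_resilience(non_monotonic_model, cf, ["salary"], steps)
score_monotonic = monotonicity_resilience(monotonic_model, cf, ["salary"], steps)
```

Under the toy non-monotonic model, a salary of £70,000 is rejected even though £50,000 is approved, so the counterfactual scores 0.5; under the monotonic model it scores 1.0. A counterfactual with a low score is valid at one point but fragile in a way users would find counterintuitive.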

