Interpretable Credit Application Predictions With Counterfactual Explanations
This work addresses interpretability issues in credit scoring for financial institutions, but it is incremental as it builds on existing counterfactual methods.
The paper aims to make credit application predictions more interpretable. It proposes positive counterfactuals and two weighting strategies that generate smaller, more interpretable explanations, and it shows that this approach outperforms the baseline counterfactual generation strategy on the HELOC dataset.
We predict credit applications with off-the-shelf, interchangeable black-box classifiers, and we explain single predictions with counterfactual explanations. Counterfactual explanations expose the minimal changes to the input data required to obtain a different result, e.g., an approved rather than a rejected application. Despite their effectiveness, counterfactuals are mainly designed for changing an undesired outcome of a prediction, i.e., a rejected loan. Counterfactuals can also be difficult to interpret, especially when many features are involved in the explanation. Our contribution is twofold: i) we propose positive counterfactuals, i.e., we adapt counterfactual explanations to also explain accepted loan applications, and ii) we propose two weighting strategies to generate more interpretable counterfactuals. Experiments on the HELOC loan applications dataset show that our contribution outperforms the baseline counterfactual generation strategy by producing smaller, and hence more interpretable, counterfactuals.