Shapley value-based approaches to explain the robustness of classifiers in machine learning
This work addresses analysts' need to interpret model robustness, enabling feature selection and the identification of features with critical or negative contributions, though it is incremental in that it applies an existing method (Shapley values) to a new aspect of explanation.
The paper tackles the problem of explaining model robustness in machine learning by proposing Shapley values to quantify each feature's contribution to robustness metrics like ROC curves and AUC, and extends this to Precision-Recall Curves for imbalanced datasets.
The use of algorithm-agnostic approaches is an emerging area of research for explaining the contribution of individual features towards the predicted outcome. Whilst there is a focus on explaining the prediction itself, little has been done on explaining the robustness of these models, that is, how each feature contributes towards achieving that robustness. In this paper, we propose the use of Shapley values to explain the contribution of each feature towards the model's robustness, measured in terms of the Receiver Operating Characteristic (ROC) curve and the Area under the ROC curve (AUC). With the help of an illustrative example, we demonstrate the proposed idea of explaining the ROC curve and visualising the uncertainties in these curves. For imbalanced datasets, the Precision-Recall Curve (PRC) is considered more appropriate; we therefore also demonstrate how to explain PRCs with the help of Shapley values. The explanation of robustness can help analysts in a number of ways. For example, it can support feature selection by identifying irrelevant features that can be removed to reduce computational complexity. It can also help identify features that make critical or negative contributions towards robustness.
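The idea of attributing AUC to individual features can be illustrated with a minimal sketch. It is not the paper's procedure, which this abstract does not spell out. It assumes a hypothetical fixed linear scorer with hand-picked `weights`, a tiny made-up dataset, and a value function `value(S)` defined as the AUC obtained when only the features in coalition `S` are used, with the empty coalition assigned the AUC of a random classifier (0.5). Under those assumptions, exact Shapley values can be computed by enumerating all coalitions:

```python
from itertools import combinations
from math import factorial

def auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def value(subset, X, y, weights):
    """Hypothetical value function: AUC of a fixed linear scorer restricted to `subset`."""
    if not subset:
        return 0.5  # assumption: no features behaves like a random classifier
    scores = [sum(weights[j] * row[j] for j in subset) for row in X]
    return auc(scores, y)

def shapley_auc(X, y, weights):
    """Exact Shapley value of each feature with respect to `value` (exponential in feature count)."""
    n = len(weights)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # k = size of the coalition that feature i joins
            coef = factorial(k) * factorial(n - k - 1) / factorial(n)
            for S in combinations(others, k):
                gain = value(S + (i,), X, y, weights) - value(S, X, y, weights)
                phi[i] += coef * gain
    return phi

# Made-up toy data: feature 0 is informative, feature 1 is constant, feature 2 is weakly informative.
X = [
    [0.9, 1.0, 0.2],
    [0.8, 1.0, 0.7],
    [0.4, 1.0, 0.1],
    [0.3, 1.0, 0.6],
    [0.7, 1.0, 0.5],
    [0.2, 1.0, 0.4],
]
y = [1, 1, 0, 0, 1, 0]
weights = [1.0, 1.0, 1.0]  # hypothetical scorer weights, not fitted

phi = shapley_auc(X, y, weights)
for j, p in enumerate(phi):
    print(f"feature {j}: contribution to AUC = {p:+.4f}")
```

By the efficiency property of Shapley values, the contributions sum to the full-model AUC minus the 0.5 baseline, and a feature that never changes the ranking (here the constant feature 1) receives a contribution of zero, which is the kind of signal that could flag it as removable. The same enumeration would apply to a PRC-based metric by swapping the `auc` function, and for more than a handful of features the exact sum would need to be replaced by a sampling approximation.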