Sajid Siraj

2 Papers

LG · Sep 9, 2022
Shapley value-based approaches to explain the robustness of classifiers in machine learning

Guilherme Dean Pelegrina, Sajid Siraj

The use of algorithm-agnostic approaches is an emerging area of research for explaining the contribution of individual features towards the predicted outcome. Whilst there is a focus on explaining the prediction itself, little has been done on explaining the robustness of these models, that is, how each feature contributes towards achieving that robustness. In this paper, we propose the use of Shapley values to explain the contribution of each feature towards the model's robustness, measured in terms of the Receiver Operating Characteristic (ROC) curve and the Area Under the ROC Curve (AUC). With the help of an illustrative example, we demonstrate the proposed idea of explaining the ROC curve and visualising the uncertainties in these curves. For imbalanced datasets, the Precision-Recall Curve (PRC) is considered more appropriate; we therefore also demonstrate how to explain PRCs with the help of Shapley values. The explanation of robustness can help analysts in a number of ways. For example, it can help in feature selection by identifying irrelevant features that can be removed to reduce computational complexity. It can also help in identifying features that make critical or negative contributions towards robustness.
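The idea of attributing AUC to individual features can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy scorer that simply sums the features in a coalition (so the empty coalition scores every sample equally and gets an AUC of 0.5), and the data and the function names `auc`, `coalition_auc`, and `shapley_auc` are hypothetical. Exact Shapley values are computed by enumerating all feature subsets, which is only feasible for a handful of features.

```python
from itertools import combinations
from math import factorial


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def coalition_auc(X, y, subset):
    """Value function: AUC of the toy scorer restricted to the features in `subset`.

    Assumption: the score is the plain sum of the selected features.
    An empty subset gives identical scores for all samples, hence AUC = 0.5.
    """
    scores = [sum(row[j] for j in subset) for row in X]
    return auc(scores, y)


def shapley_auc(X, y):
    """Exact Shapley value of each feature with respect to coalition_auc."""
    m = len(X[0])
    phi = [0.0] * m
    for i in range(m):
        others = [j for j in range(m) if j != i]
        for k in range(m):
            # Shapley weight for coalitions of size k that exclude feature i
            weight = factorial(k) * factorial(m - k - 1) / factorial(m)
            for S in combinations(others, k):
                gain = coalition_auc(X, y, S + (i,)) - coalition_auc(X, y, S)
                phi[i] += weight * gain
    return phi


# Hypothetical data: feature 0 separates the classes, feature 1 is misleading,
# feature 2 is constant and therefore cannot change any ranking.
X = [[9, 2, 5], [8, 7, 5], [7, 1, 5], [3, 6, 5], [2, 3, 5], [1, 8, 5]]
y = [1, 1, 1, 0, 0, 0]

phi = shapley_auc(X, y)
full_auc = coalition_auc(X, y, (0, 1, 2))
for j, value in enumerate(phi):
    print(f"feature {j}: contribution to AUC = {value:+.4f}")
print(f"sum of contributions = {sum(phi):+.4f}, full AUC - 0.5 = {full_auc - 0.5:+.4f}")
```

In this toy setting the contributions add up to the gap between the full model's AUC and the 0.5 baseline, the constant feature receives zero, and the misleading feature receives a negative value, which mirrors the abstract's point about spotting irrelevant and harmful features.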

CY · Oct 21, 2022
How fair were COVID-19 restriction decisions? A data-driven investigation of England using the dominance-based rough sets approach

Edward Abel, Sajid Siraj

During the COVID-19 pandemic, several countries adopted tiered restrictions, an approach that has remained a point of debate due to a lack of transparency. Using the dominance-based rough set approach, we identify patterns in the COVID-19 data pertaining to the UK government's tiered restrictions allocation system. The insights from this analysis are translated into "if-then" type rules, which can easily be interpreted by policy makers. The differences in the rules extracted from different geographical areas suggest inconsistencies in the allocation of tiers in these areas. We found that the differences delineated an overall north-south divide in England; however, this divide was driven mostly by London. Based on our analysis, we demonstrate the usefulness of the dominance-based rough set approach for investigating the fairness and explainability of decision making regarding COVID-19 restrictions. The proposed approach and analysis could provide a more transparent approach to localised public health restrictions, which can help ensure greater conformity to the public safety rules.
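The core check behind a dominance-based analysis can be sketched briefly. The example below is an illustration under stated assumptions, not the authors' pipeline: the area names, the two criteria (case rate and test positivity, where higher means more severe), and the tier assignments are all invented, and the helper names are hypothetical. It flags pairs of areas that violate the dominance principle (one area is at least as severe on every criterion yet received a lower tier) and computes the standard lower approximation of "tier at least t", that is, the areas whose membership is not contradicted by any area dominating them.

```python
def dominates(a, b):
    """True if criteria vector a is at least as severe as b on every criterion."""
    return all(x >= y for x, y in zip(a, b))


def inconsistent_pairs(areas):
    """Pairs (a, b) where area a dominates area b but was given a lower tier."""
    pairs = []
    for name_a, (crit_a, tier_a) in areas.items():
        for name_b, (crit_b, tier_b) in areas.items():
            if name_a != name_b and dominates(crit_a, crit_b) and tier_a < tier_b:
                pairs.append((name_a, name_b))
    return pairs


def lower_approx_at_least(areas, t):
    """Areas that certainly belong to 'tier >= t'.

    An area qualifies only if every area dominating it (itself included)
    also has tier >= t.
    """
    result = set()
    for name, (crit, _tier) in areas.items():
        dominating = [other for other, (c, _) in areas.items() if dominates(c, crit)]
        if all(areas[other][1] >= t for other in dominating):
            result.add(name)
    return result


# Hypothetical data: name -> ((cases per 100k, positivity %), assigned tier)
areas = {
    "A": ((300, 12), 3),
    "B": ((250, 10), 3),
    "C": ((280, 11), 2),
    "D": ((100, 4), 1),
}

conflicts = inconsistent_pairs(areas)
certain_tier3 = lower_approx_at_least(areas, 3)
print("inconsistent pairs:", conflicts)
print("certainly tier >= 3:", sorted(certain_tier3))
```

Here area C is more severe than B on both criteria but sits in a lower tier, so B's tier-3 status is only "possible" rather than "certain". An "if-then" rule such as "if cases >= 300 and positivity >= 12 then tier >= 3" would be induced from the certain region only.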