Measuring Implicit Bias Using SHAP Feature Importance and Fuzzy Cognitive Maps
This work addresses fairness in machine learning and is relevant to both researchers and practitioners, but its contribution is incremental because it builds on existing methods such as SHAP and Fuzzy Cognitive Maps.
The paper tackles the problem of measuring implicit bias in pattern classification by integrating SHAP feature importance with Fuzzy Cognitive Maps, finding that feature importance methods alone are risky for bias measurement and that bias levels differ based on feature encoding types.
In this paper, we integrate the concepts of feature importance and implicit bias in the context of pattern classification. We do so through a three-step methodology that involves (i) building a classifier and tuning its hyperparameters, (ii) building a Fuzzy Cognitive Map model able to quantify implicit bias, and (iii) using the SHAP feature importance to activate the neural concepts when performing simulations. The results of a real case study in fairness research support our two-fold hypothesis. On the one hand, they illustrate the risks of using a feature importance method as an absolute tool to measure implicit bias. On the other hand, they show that the amount of bias towards protected features might differ depending on whether the features are numerically or categorically encoded.
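Step (iii) of the methodology can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes the feature importance scores have already been computed (for example with SHAP) and that the Fuzzy Cognitive Map weights are given. The function names, the sigmoid transfer function, the `phi` mixing parameter, and all numeric values are hypothetical choices made for illustration only.

```python
import math


def sigmoid(x):
    """Squash a raw concept value into the (0, 1) interval."""
    return 1.0 / (1.0 + math.exp(-x))


def normalize(importances):
    """Scale absolute importance scores to [0, 1] by dividing by the largest."""
    magnitudes = [abs(v) for v in importances]
    top = max(magnitudes)
    return [m / top if top > 0 else 0.0 for m in magnitudes]


def fcm_step(state, initial, weights, phi):
    """One reasoning step; weights[j][i] is the influence of concept j on i.

    phi (assumed, in [0, 1]) controls how much of the new state comes from
    the propagated signal versus the initial activation.
    """
    n = len(state)
    new_state = []
    for i in range(n):
        raw = sum(weights[j][i] * state[j] for j in range(n))
        new_state.append(phi * sigmoid(raw) + (1.0 - phi) * initial[i])
    return new_state


def simulate(initial, weights, steps=10, phi=0.8):
    """Run the map for a fixed number of steps from the initial activations."""
    state = list(initial)
    for _ in range(steps):
        state = fcm_step(state, initial, weights, phi)
    return state


# Hypothetical importance scores for three features; the second one plays
# the role of a protected feature in this toy example.
importance_scores = [0.2, -0.4, 0.1]

# Hypothetical weight matrix encoding how strongly features relate.
weights = [
    [0.0, 0.5, 0.0],
    [0.0, 0.0, 0.3],
    [0.2, 0.0, 0.0],
]

initial_activations = normalize(importance_scores)
final_activations = simulate(initial_activations, weights)
```

In this sketch, the final activation of the concept standing for the protected feature is what would be read as an implicit bias indicator. How that value should be interpreted depends on how the weights are built, which the sketch does not cover.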