BENN: Bias Estimation Using Deep Neural Network
This work addresses the challenge of inconsistent and expert-dependent bias detection methods for machine learning practitioners by offering a generic, automated, and comparable bias estimation tool.
This paper introduces BENN, a novel bias estimation method that uses a pretrained unsupervised deep neural network to estimate bias for every feature based on a model's predictions. BENN was evaluated against an ensemble of 21 existing bias estimation methods on three benchmark datasets and a proprietary churn prediction model, producing bias estimations aligned with those of the ensemble without requiring domain expertise.
The need to detect bias in machine learning (ML) models has led to the development of multiple bias detection methods, yet using them is challenging because each method: i) explores a different ethical aspect of bias, which may result in contradictory output among the different methods; ii) provides output on a different range/scale and therefore cannot be compared with other methods; and iii) requires different input, so a human expert must be involved to adjust each method to the examined model. In this paper, we present BENN, a novel bias estimation method that uses a pretrained unsupervised deep neural network. Given an ML model and data samples, BENN provides a bias estimation for every feature based on the model's predictions. We evaluated BENN on three benchmark datasets and one proprietary churn prediction model used by a European Telco, and compared it with an ensemble of 21 existing bias estimation methods. The evaluation results highlight the significant advantages of BENN over the ensemble: it is generic (i.e., can be applied to any ML model) and requires no domain expert, yet it provides bias estimations that are aligned with those of the ensemble.
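
To illustrate the scale mismatch noted in point ii), the minimal sketch below computes two widely used group-fairness measures on the same predictions. The choice of measures (statistical parity difference and disparate impact), the function names, and the toy data are assumptions made for illustration only; they are not taken from the paper and are not claimed to be among the 21 ensemble methods. The first measure lies in [-1, 1] with 0 indicating parity, while the second lies in [0, ∞) with 1 indicating parity, so their raw outputs cannot be compared directly.

```python
import numpy as np


def statistical_parity_difference(y_pred, protected):
    """P(y_hat = 1 | protected = 1) - P(y_hat = 1 | protected = 0).

    Range: [-1, 1]; 0 means both groups receive positive predictions
    at the same rate.
    """
    y_pred = np.asarray(y_pred, dtype=float)
    protected = np.asarray(protected)
    return y_pred[protected == 1].mean() - y_pred[protected == 0].mean()


def disparate_impact(y_pred, protected):
    """P(y_hat = 1 | protected = 1) / P(y_hat = 1 | protected = 0).

    Range: [0, inf); 1 means parity. Undefined when the reference
    group has no positive predictions (division by zero).
    """
    y_pred = np.asarray(y_pred, dtype=float)
    protected = np.asarray(protected)
    return y_pred[protected == 1].mean() / y_pred[protected == 0].mean()


# Hypothetical toy data: binary predictions and a binary protected feature.
y_pred = np.array([1, 0, 0, 0, 1, 1, 1, 0])
protected = np.array([1, 1, 1, 1, 0, 0, 0, 0])

spd = statistical_parity_difference(y_pred, protected)
di = disparate_impact(y_pred, protected)
print(f"statistical parity difference: {spd:.3f}")
print(f"disparate impact ratio:        {di:.3f}")
```

Both values describe the same disparity (a positive-prediction rate of 0.25 versus 0.75), yet one is reported as a difference around 0 and the other as a ratio around 1, which is the kind of incomparability a unified estimator is meant to avoid.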