Underestimation Bias and Underfitting in Machine Learning
This paper addresses a critical gap in how machine learning practitioners understand algorithmic bias, though the contribution appears incremental because it builds on existing regularization concepts.
The paper investigates how classification algorithms themselves can introduce or worsen bias, particularly underestimation bias linked to regularization. This source of bias is often overlooked because researchers focus on methods that eliminate bias regardless of where it originates.
Often, what is termed algorithmic bias in machine learning will be due to historic bias in the training data. But sometimes the bias may be introduced (or at least exacerbated) by the algorithm itself. The ways in which algorithms can accentuate bias have not received much attention, with researchers focusing directly on methods to eliminate bias, no matter the source. In this paper we report on initial research to understand the factors that contribute to bias in classification algorithms. We believe this is important because underestimation bias is inextricably tied to regularization, i.e., measures to address overfitting can accentuate bias.
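The link between regularization and underestimation can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic one-dimensional data, the plain gradient-descent logistic regression, the penalty strengths, and the ratio of predicted to actual positives used as a rough measure of underestimation. A ratio below 1 means the classifier predicts the minority (positive) class less often than it occurs in the data.

```python
import math
import random


def make_data(n_neg=400, n_pos=100, seed=0):
    """Synthetic imbalanced data: overlapping Gaussians, positives are the minority."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_neg)]
    xs += [rng.gauss(2.0, 1.0) for _ in range(n_pos)]
    ys = [0] * n_neg + [1] * n_pos
    return xs, ys


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, l2, lr=0.2, steps=2000):
    """Logistic regression by batch gradient descent.

    The L2 penalty (l2 / 2) * w**2 is applied to the weight only,
    not to the intercept b.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * (grad_w / n + l2 * w)
        b -= lr * (grad_b / n)
    return w, b


def underestimation_ratio(xs, ys, w, b):
    """Predicted positives divided by actual positives (illustrative measure)."""
    predicted_pos = sum(1 for x in xs if w * x + b > 0.0)
    actual_pos = sum(ys)
    return predicted_pos / actual_pos


xs, ys = make_data()

# Hypothetical penalty strengths chosen only to contrast the two regimes.
w_weak, b_weak = fit_logistic(xs, ys, l2=0.0)
w_strong, b_strong = fit_logistic(xs, ys, l2=5.0)

us_weak = underestimation_ratio(xs, ys, w_weak, b_weak)
us_strong = underestimation_ratio(xs, ys, w_strong, b_strong)

print("no regularization:     ratio =", round(us_weak, 3))
print("strong regularization: ratio =", round(us_strong, 3))
```

In this toy setting the heavily penalized model shrinks its weight towards zero and falls back on the majority class, so its ratio drops well below that of the unpenalized model. This matches the abstract's claim that measures taken against overfitting can accentuate bias.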