Pathological Regularization Regimes in Classification Tasks
This work identifies a pitfall that data science practitioners can encounter in hyperparameter tuning and offers tools for avoiding regularization choices under which a classifier's scores run against the trend in the data.
The paper shows that certain choices of the regularization parameter can cause a trend reversal in binary classification between the dataset labels and the model's predictions; the corresponding parameter ranges are termed pathological regularization regimes. It provides algebraic conditions for identifying and avoiding such regimes in ridge regression, together with numerical results for logistic regression.
In this paper, we demonstrate the possibility of a trend reversal in binary classification tasks between the dataset and a classification score obtained from a trained model. This trend reversal occurs for certain choices of the regularization parameter used in model training, namely those contained in what we call the pathological regularization regime. For ridge regression, we give necessary and sufficient algebraic conditions on the dataset for the existence of a pathological regularization regime. Moreover, our results provide a data science practitioner with a hands-on tool to avoid hyperparameter choices suffering from trend reversal. We furthermore present numerical results on pathological regularization regimes for logistic regression. Finally, we draw connections to datasets exhibiting Simpson's paradox, which provide a natural source of pathological datasets.
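The dependence of a ridge coefficient's sign on the regularization parameter can be seen on a very small example. The following is a minimal sketch and not the paper's method. The four-point dataset is hypothetical and chosen by hand, and the helper names `ridge_coefficients` and `marginal_trend` are illustrative. The model has no intercept. The "trend" of a feature is taken here to be the sign of its inner product with the labels, which is a simplifying assumption and may differ from the paper's definition. The code scans a grid of regularization values and reports where the sign of the first ridge coefficient disagrees with that marginal trend.

```python
# Hypothetical toy data: two centered features, labels in {-1, +1}.
X = [(3.0, 2.0), (0.5, 1.0), (-0.5, -1.0), (-3.0, -2.0)]
y = [1.0, 1.0, -1.0, -1.0]


def ridge_coefficients(X, y, lam):
    """Solve (X^T X + lam * I) beta = X^T y for two features, no intercept."""
    a = sum(x1 * x1 for x1, _ in X) + lam          # Gram entry (1, 1) plus lam
    b = sum(x1 * x2 for x1, x2 in X)               # Gram entry (1, 2)
    c = sum(x2 * x2 for _, x2 in X) + lam          # Gram entry (2, 2) plus lam
    p = sum(x1 * t for (x1, _), t in zip(X, y))    # first entry of X^T y
    q = sum(x2 * t for (_, x2), t in zip(X, y))    # second entry of X^T y
    det = a * c - b * b                            # nonzero for lam >= 0 on this data
    # Cramer's rule for the 2x2 linear system.
    return ((c * p - b * q) / det, (a * q - b * p) / det)


def marginal_trend(X, y, j):
    """Inner product of feature j with the labels (assumed notion of trend)."""
    return sum(row[j] * t for row, t in zip(X, y))


trend = marginal_trend(X, y, 0)
for lam in (0.0, 0.5, 1.0, 2.0, 10.0):
    beta1, beta2 = ridge_coefficients(X, y, lam)
    disagrees = beta1 * trend < 0
    print(f"lambda={lam:5.1f}  beta1={beta1:+.4f}  beta2={beta2:+.4f}  "
          f"sign disagrees with marginal trend: {disagrees}")
```

On this data, the first feature has a positive inner product with the labels, yet its ridge coefficient is -0.5 at λ = 0 and only becomes positive once λ exceeds 8/7. A grid scan of this kind is a crude stand-in for the algebraic conditions described above, which characterize such regimes exactly rather than by sampling.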