ML | LG | ME | May 24, 2024

A Systematic Bias of Machine Learning Regression Models and Its Correction: an Application to Imaging-based Brain Age Prediction

arXiv:2405.15950v2 | 2 citations | h-index: 4
Originality: Incremental advance
AI Analysis

The paper addresses a fundamental issue in regression modeling that affects accuracy across domains, particularly in medical imaging applications such as brain age prediction. The advance is incremental because it builds on known bias problems.

The paper tackles the systematic bias in machine learning regression models, where predictions deviate from actual values, especially for extreme outcomes, and proposes a constrained optimization method to correct it, demonstrating effectiveness in simulations and neuroimaging-based brain age prediction.

Machine learning models for continuous outcomes often yield systematically biased predictions, particularly for values that deviate substantially from the mean. Specifically, predictions for large-valued outcomes tend to be negatively biased (underestimating actual values), while those for small-valued outcomes are positively biased (overestimating actual values). We refer to this linear bias, in which predictions are warped toward the central tendency, as the "systematic bias of machine learning regression". In this paper, we first demonstrate that this systematic prediction bias persists across various machine learning regression models, and then examine its theoretical underpinnings. To address this issue, we propose a general constrained optimization approach designed to correct this bias and develop computationally efficient implementation algorithms. Simulation results indicate that our correction method effectively eliminates the bias from the predicted outcomes. We apply the proposed approach to the prediction of brain age using neuroimaging data. In comparison to competing machine learning regression models, our method effectively addresses the longstanding issue of "systematic bias of machine learning regression" in neuroimaging-based brain age calculation, yielding unbiased predictions of brain age.
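The bias pattern described in the abstract can be illustrated with a minimal sketch. It does not reproduce the paper's models, data, or correction method: it assumes simulated Gaussian data and an ordinary least-squares fit, and the helper name `bias_slope` and all parameter values are hypothetical choices made for illustration. The sketch regresses the prediction error on the true outcome; a negative slope means large outcomes are underestimated and small outcomes are overestimated.

```python
import numpy as np


def bias_slope(y_true, y_pred):
    """Slope from regressing the prediction error (y_pred - y_true) on y_true.

    A negative slope indicates predictions pulled toward the mean.
    (Hypothetical helper, not taken from the paper.)
    """
    error = y_pred - y_true
    centered = y_true - y_true.mean()
    return float(centered @ (error - error.mean()) / (centered @ centered))


# Simulated data (assumed setup): linear signal plus Gaussian noise.
rng = np.random.default_rng(0)
n, p = 2000, 5
X = rng.normal(size=(n, p))
beta = rng.normal(size=p)
y = X @ beta + rng.normal(scale=2.0, size=n)

# Ordinary least squares with an intercept, fitted via numpy only.
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ coef

slope = bias_slope(y, y_hat)
r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

# Mean prediction error in the top and bottom quartiles of the true outcome.
error = y_hat - y
top_error = error[y >= np.quantile(y, 0.75)].mean()
bottom_error = error[y <= np.quantile(y, 0.25)].mean()

print(f"slope of error on outcome: {slope:.3f}")
print(f"R^2 - 1:                   {r2 - 1.0:.3f}")
print(f"mean error, top quartile:    {top_error:.3f}")
print(f"mean error, bottom quartile: {bottom_error:.3f}")
```

For an in-sample least-squares fit with an intercept, the slope equals R² − 1, so the bias is present whenever the fit is imperfect and shrinks only as R² approaches 1. This is a property of the simplified setting used here, offered as intuition rather than as the paper's theoretical result.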

Code Implementations: 1 repo