Flexible Group Fairness Metrics for Survival Analysis
This work addresses fairness in survival analysis, an underexplored domain that matters for sensitive applications such as medical diagnosis. Its contribution is incremental, as it applies existing fairness concepts to this domain.
The paper tackles the problem of measuring algorithmic bias in survival analysis, the task of predicting the probability of an event occurring over time. It evaluates existing survival metrics in combination with group fairness metrics across 29 datasets and 8 measures, and finds that discrimination measures capture bias well, whereas calibration measures and scoring rules give less clear results.
Algorithmic fairness is an increasingly important field concerned with detecting and mitigating biases in machine learning models. There is a wealth of literature on algorithmic fairness in regression and classification; however, there has been little exploration of the field for survival analysis. Survival analysis is the prediction task in which one attempts to predict the probability of an event occurring over time. Survival predictions are particularly important in sensitive settings, such as when utilising machine learning for the diagnosis and prognosis of patients. In this paper we explore how to utilise existing survival metrics to measure bias with group fairness metrics. We explore this in an empirical experiment with 29 survival datasets and 8 measures. We find that measures of discrimination are able to capture bias well, whereas there is less clarity with measures of calibration and scoring rules. We suggest further areas for research, including prediction-based fairness metrics for distribution predictions.
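To make the idea of combining a survival metric with a group fairness metric concrete, the following is a minimal sketch rather than the paper's own method. It assumes Harrell's concordance index as the discrimination measure and assumes the fairness metric is the absolute difference in that measure between a protected subgroup and everyone else; the abstract does not specify either choice. The function names `concordance_index` and `group_fairness_gap` and the toy data are hypothetical.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable pairs that the risk scores order correctly."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the earlier one is an event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # higher risk failed first: correct ordering
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied risks count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def group_fairness_gap(times, events, risks, groups, protected):
    """Assumed fairness metric: |C-index(protected) - C-index(others)|."""
    def score(in_protected):
        idx = [k for k, g in enumerate(groups) if (g == protected) == in_protected]
        return concordance_index(
            [times[k] for k in idx],
            [events[k] for k in idx],
            [risks[k] for k in idx],
        )
    return abs(score(True) - score(False))


# Toy data (hypothetical): observed times, event indicators (1 = event, 0 = censored),
# predicted risk scores, and a binary protected attribute.
times = [2, 4, 6, 8, 1, 3, 5, 7]
events = [1, 1, 0, 1, 1, 1, 1, 0]
risks = [0.9, 0.7, 0.5, 0.1, 0.2, 0.8, 0.4, 0.6]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

gap = group_fairness_gap(times, events, risks, groups, protected="A")
print(f"C-index gap between groups: {gap:.3f}")
```

A gap near zero would indicate that the model discriminates between early and late events about equally well in both groups under this assumed metric; a large gap would flag a disparity worth investigating. The same wrapper could in principle be applied to a calibration measure or scoring rule by swapping out `concordance_index`.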