LG · Sep 4, 2025

On Aligning Prediction Models with Clinical Experiential Learning: A Prostate Cancer Case Study

Jacqueline J. Vallon, William Overman, Wanqiao Xu, Neil Panjwani, Xi Ling, Sushmita Vij, Hilary P. Bagshaw, John T. Leppert, Sumit Shah, Geoffrey Sonn, Sandy Srinivas, Erqi Pollom

arXiv:2509.04053v1 · 1 citation · h-index: 45

Originality: Incremental advance

AI Analysis

This work addresses model interpretability and alignment with expert knowledge for clinicians in healthcare. The advance is incremental, since it builds on existing methods for incorporating constraints.

The paper tackles the misalignment between machine learning model predictions and clinical experiential knowledge in healthcare, using prostate cancer outcome prediction as a case study. It incorporates clinical constraints into the models without compromising performance. It also shows that the difference in clinical interpretation becomes more apparent to clinicians as the divergence between the constrained and unconstrained models' predictions for a patient increases.

Over the past decade, the use of machine learning (ML) models in healthcare applications has rapidly increased. Despite high performance, modern ML models do not always capture patterns the end user requires. For example, a model may predict a non-monotonically decreasing relationship between cancer stage and survival, keeping all other features fixed. In this paper, we present a reproducible framework for investigating this misalignment between model behavior and clinical experiential learning, focusing on the effects of underspecification of modern ML pipelines. In a prostate cancer outcome prediction case study, we first identify and address these inconsistencies by incorporating clinical knowledge, collected by a survey, via constraints into the ML model, and subsequently analyze the impact on model performance and behavior across degrees of underspecification. The approach shows that aligning the ML model with clinical experiential learning is possible without compromising performance. Motivated by recent literature in generative AI, we further examine the feasibility of a feedback-driven alignment approach in non-generative AI clinical risk prediction models through a randomized experiment with clinicians. Our findings illustrate that, by eliciting clinicians' model preferences using our proposed methodology, the larger the difference in how the constrained and unconstrained models make predictions for a patient, the more apparent the difference is in clinical interpretation.
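The stage-versus-survival example in the abstract can be made concrete with a small sketch. The abstract does not say how the constraints are built into the model, so the code below is not the authors' method: it uses a hypothetical toy model (`toy_unconstrained_model`, with made-up numbers) and a swapped-in technique, a post-hoc pool-adjacent-violators projection, purely to illustrate what a monotonicity violation looks like and what a constrained prediction profile would have to satisfy.

```python
from typing import Callable, Dict, List, Sequence

Patient = Dict[str, float]


def toy_unconstrained_model(patient: Patient) -> float:
    """Hypothetical survival-probability model (illustrative numbers only).

    The lookup table deliberately rises from stage 2 to stage 3, which is
    the kind of behavior a clinician would not expect.
    """
    by_stage = {1: 0.90, 2: 0.80, 3: 0.85, 4: 0.40}
    return by_stage[int(patient["stage"])] - 0.002 * (patient["age"] - 60)


def stage_profile(model: Callable[[Patient], float], patient: Patient,
                  stages: Sequence[int]) -> List[float]:
    """Predictions for one patient as only the stage feature is varied."""
    return [model({**patient, "stage": s}) for s in stages]


def violates_non_increasing(values: Sequence[float], tol: float = 1e-12) -> bool:
    """True if any value is larger than the one before it."""
    return any(b > a + tol for a, b in zip(values, values[1:]))


def project_non_increasing(values: Sequence[float]) -> List[float]:
    """Pool-adjacent-violators: least-squares non-increasing fit."""
    blocks: List[List[float]] = []  # each block holds [sum, count]
    for v in values:
        blocks.append([v, 1])
        # Merge blocks while an earlier block mean is below a later one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    out: List[float] = []
    for s, c in blocks:
        out.extend([s / c] * int(c))
    return out


if __name__ == "__main__":
    patient = {"age": 60, "stage": 1}
    profile = stage_profile(toy_unconstrained_model, patient, [1, 2, 3, 4])
    print("unconstrained:", profile)
    print("violation:", violates_non_increasing(profile))
    print("projected:", project_non_increasing(profile))
```

For the toy patient, the unconstrained profile rises between stages 2 and 3, and the projection pools those two values to their mean so the profile no longer increases with stage. A constraint applied during training (rather than after prediction) would be a different design choice; this sketch only shows the property being enforced.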
