LGSep 15, 2022
From algorithms to action: improving patient care requires causalityWouter A. C. van Amsterdam, Pim A. de Jong, Joost J. C. Verhoeff et al.
In cancer research there is much interest in building and validating outcome predicting outcomes to support treatment decisions. However, because most outcome prediction models are developed and validated without regard to the causal aspects of treatment decision making, many published outcome prediction models may cause harm when used for decision making, despite being found accurate in validation studies. Guidelines on prediction model validation and the checklist for risk model endorsement by the American Joint Committee on Cancer do not protect against prediction models that are accurate during development and validation but harmful when used for decision making. We explain why this is the case and how to build and validate models that are useful for decision making.
MESep 2, 2024
A causal viewpoint on prediction model performance under changes in case-mix: discrimination and calibration respond differently for prognosis and diagnosis predictionsWouter A. C. van Amsterdam
Prediction models need reliable predictive performance as they inform clinical decisions, aiding in diagnosis, prognosis, and treatment planning. The predictive performance of these models is typically assessed through discrimination and calibration. Changes in the distribution of the data impact model performance and there may be important changes between a model's current application and when and where its performance was last evaluated. In health-care, a typical change is a shift in case-mix. For example, for cardiovascular risk management, a general practitioner sees a different mix of patients than a specialist in a tertiary hospital. This work introduces a novel framework that differentiates the effects of case-mix shifts on discrimination and calibration based on the causal direction of the prediction task. When prediction is in the causal direction (often the case for prognosis predictions), calibration remains stable under case-mix shifts, while discrimination does not. Conversely, when predicting in the anti-causal direction (often with diagnosis predictions), discrimination remains stable, but calibration does not. A simulation study and empirical validation using cardiovascular disease prediction models demonstrate the implications of this framework. The causal case-mix framework provides insights for developing, evaluating and deploying prediction models across different clinical settings, emphasizing the importance of understanding the causal structure of the prediction task.
LGMar 20, 2025
Time After Time: Deep-Q Effect Estimation for Interventions on When and What to doYoav Wald, Mark Goldstein, Yonathan Efroni et al.
Problems in fields such as healthcare, robotics, and finance requires reasoning about the value both of what decision or action to take and when to take it. The prevailing hope is that artificial intelligence will support such decisions by estimating the causal effect of policies such as how to treat patients or how to allocate resources over time. However, existing methods for estimating the effect of a policy struggle with \emph{irregular time}. They either discretize time, or disregard the effect of timing policies. We present a new deep-Q algorithm that estimates the effect of both when and what to do called Earliest Disagreement Q-Evaluation (EDQ). EDQ makes use of recursion for the Q-function that is compatible with flexible sequence models, such as transformers. EDQ provides accurate estimates under standard assumptions. We validate the approach through experiments on survival time and tumor growth tasks.
LGApr 1, 2019
Controlling for Biasing Signals in Images for Prognostic Models: Survival Predictions for Lung Cancer with Deep LearningWouter A. C. van Amsterdam, Marinus J. C. Eijkemans
Deep learning has shown remarkable results for image analysis and is expected to aid individual treatment decisions in health care. To achieve this, deep learning methods need to be promoted from the level of mere associations to being able to answer causal questions. We present a scenario with real-world medical images (CT-scans of lung cancers) and simulated outcome data. Through the sampling scheme, the images contain two distinct factors of variation that represent a collider and a prognostic factor. We show that when this collider can be quantified, unbiased individual prognosis predictions are attainable with deep learning. This is achieved by (1) setting a dual task for the network to predict both the outcome and the collider and (2) enforcing independence of the activation distributions of the last layer with ordinary least squares. Our method provides an example of combining deep learning and structural causal models for unbiased individual prognosis predictions.