Confounder Detection via Treatment Intent: A New Observational Study Design

Drago Plecko, Patrik Okanovic, Torsten Hoefler, Elias Bareinboim

arXiv:2605.2641366.7

AI Analysis

For researchers in causal inference and observational studies, this work addresses the critical problem of unobserved confounding by proposing a human-in-the-loop method to detect hidden confounders.

The paper introduces a new study design, confounder detection via treatment intent, that queries human experts to elicit unobserved confounders by comparing matched pairs. It provides theoretical foundations and demonstrates proof-of-concept in ICU settings, showing evidence of unobserved confounding in EHRs.

Understanding the effects of interventions is central to scientific progress, with randomized controlled trials (RCTs) regarded as the gold standard for causal inference in many applied fields. However, RCTs are costly, time-consuming, and often constrained by ethical or practical limitations, motivating the need for causal methods able to draw conclusions from observational data. While such data is collected at ever larger scale, its use for causal inference is often hindered by the fact that not all variables affecting treatment allocation and the outcome are observed, an issue known as unobserved confounding. In this paper, we introduce a new study design called confounder detection via treatment intent. The idea is to query a human expert who makes treatment decisions and ask them to compare pairs of units proposed by a principled matching strategy, with the goal of eliciting unobserved variables that explain why treatment decisions differ. We provide a theoretical basis for this procedure, ascertaining conditions under which such a study design may elicit unobserved confounders. Building on these newly established foundations, we study treatment effects of interventions in the intensive care unit (ICU). First, we show empirical evidence strongly indicating that electronic health records (EHRs) collected in ICUs are subject to unobserved confounding. Then, using clinical text notes as a proxy for physicians' knowledge and leveraging natural language processing, we provide a proof of concept for our methodology in a semi-synthetic environment with a known ground truth.
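
To make the pair-proposal step concrete, here is a minimal sketch in Python. The abstract does not specify the matching strategy, so everything below is an assumption for illustration: the function name `propose_pairs`, the use of standardized Euclidean distance, and nearest-neighbor matching are hypothetical choices, not the authors' method. The idea shown is only that units which look alike on observed covariates but received different treatments are natural candidates to put in front of an expert.

```python
import numpy as np

def propose_pairs(X, t, k=3):
    """Hypothetical helper: propose up to k (treated, control, distance) pairs.

    X : (n, d) array of observed covariates.
    t : length-n array of binary treatment decisions (1 = treated).
    Assumes at least one treated and one untreated unit.
    """
    X = np.asarray(X, dtype=float)
    t = np.asarray(t, dtype=bool)

    # Standardize covariates so no single feature dominates the distance.
    std = X.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    Z = (X - X.mean(axis=0)) / std

    treated = np.flatnonzero(t)
    control = np.flatnonzero(~t)

    # Pairwise Euclidean distances between treated and untreated units.
    D = np.linalg.norm(Z[treated][:, None, :] - Z[control][None, :, :], axis=2)

    # For each treated unit, take its nearest untreated unit (with replacement).
    nearest = D.argmin(axis=1)
    pairs = [
        (int(treated[i]), int(control[j]), float(D[i, j]))
        for i, j in enumerate(nearest)
    ]

    # The closest pairs are the most informative to query: the observed data
    # cannot explain why their treatment decisions differ.
    pairs.sort(key=lambda p: p[2])
    return pairs[:k]
```

Each returned pair would then be shown to the expert with a question such as "why was one unit treated and the other not?"; any reason that is not already among the columns of `X` is a candidate unobserved confounder.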
