C-kNN-LSH: A Nearest-Neighbor Algorithm for Sequential Counterfactual Inference
This work addresses sequential causal inference for clinical decision-making in complex conditions such as long COVID. It is an incremental improvement: a new method aimed at a known bottleneck.
The paper tackles the estimation of causal effects from longitudinal trajectories in high-dimensional, confounded settings such as long COVID recovery. It introduces C-kNN-LSH, a nearest-neighbor framework that uses locality-sensitive hashing to efficiently identify similar patients and adds a doubly-robust correction to mitigate bias. On a real-world long COVID cohort of 13,511 participants, the method outperforms existing baselines in capturing recovery heterogeneity and estimating policy values.
Estimating causal effects from longitudinal trajectories is central to understanding the progression of complex conditions, such as comorbidities and long COVID recovery, and to optimizing clinical decision-making. We introduce \emph{C-kNN-LSH}, a nearest-neighbor framework for sequential causal inference designed to handle such high-dimensional, confounded settings. Using locality-sensitive hashing, we efficiently identify ``clinical twins'' with similar covariate histories, which enables local estimation of conditional treatment effects across evolving disease states. To mitigate bias from irregular sampling and shifting patient recovery profiles, we combine the neighborhood estimator with a doubly-robust correction. Our theoretical analysis shows that the estimator is consistent and second-order robust to nuisance estimation error. Evaluated on a real-world long COVID cohort of 13,511 participants, \emph{C-kNN-LSH} outperforms existing baselines in capturing recovery heterogeneity and estimating policy values.
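To make the pipeline described above concrete, the following is a minimal sketch of a single decision point, not the authors' implementation. It rests on several assumptions the abstract does not confirm: each patient's covariate history is flattened into one vector, the hash family is random-hyperplane (sign) LSH, and the doubly-robust correction takes the standard augmented inverse-propensity-weighted (AIPW) form. All function names, parameters, and the synthetic data are hypothetical, and the nuisance estimates (`e_hat`, `mu0_hat`, `mu1_hat`) are taken as given.

```python
import numpy as np

def lsh_signatures(X, n_bits=8, seed=0):
    """Random-hyperplane LSH (an assumed hash family): one integer bucket key per row."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((X.shape[1], n_bits))
    bits = (X @ planes > 0).astype(np.int64)      # sign pattern of each row
    return bits @ (1 << np.arange(n_bits))        # pack the bits into one integer

def knn_in_bucket(X, keys, i, k):
    """Indices of the k rows closest to X[i] among rows that share its bucket."""
    cand = np.flatnonzero(keys == keys[i])
    cand = cand[cand != i]                        # exclude the query patient
    if cand.size == 0:
        return cand
    dist = np.linalg.norm(X[cand] - X[i], axis=1)
    return cand[np.argsort(dist)[:k]]

def dr_local_effect(X, A, Y, e_hat, mu0_hat, mu1_hat, i, k=20, n_bits=8, seed=0):
    """Average the AIPW score over the 'clinical twins' of patient i."""
    keys = lsh_signatures(X, n_bits, seed)
    nb = knn_in_bucket(X, keys, i, k)
    if nb.size == 0:
        return float("nan")                       # no twins found in this bucket
    a, y = A[nb], Y[nb]
    e = np.clip(e_hat[nb], 0.05, 0.95)            # guard against extreme propensities
    psi = (mu1_hat[nb] - mu0_hat[nb]
           + a * (y - mu1_hat[nb]) / e
           - (1 - a) * (y - mu0_hat[nb]) / (1 - e))
    return float(psi.mean())

# Synthetic, noiseless toy data (hypothetical): the true treatment effect is 2.0.
rng = np.random.default_rng(1)
n, d = 400, 5
X = rng.standard_normal((n, d))
A = rng.integers(0, 2, size=n)
Y = X[:, 0] + 2.0 * A
mu0_hat, mu1_hat = X[:, 0], X[:, 0] + 2.0         # oracle outcome models
e_hat = np.full(n, 0.5)                           # oracle propensity score

keys = lsh_signatures(X, n_bits=2)
neighbors = knn_in_bucket(X, keys, i=0, k=20)
est = dr_local_effect(X, A, Y, e_hat, mu0_hat, mu1_hat, i=0, k=20, n_bits=2)
```

Restricting the search to one hash bucket trades exactness for speed: only a fraction of the cohort is scanned per query, but a true nearest neighbor that falls in an adjacent bucket is missed. Practical LSH schemes usually offset this with several hash tables, which this sketch omits.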