FlatASCEND: Autoregressive Clinical Sequence Generation with Continuous Time Prediction and Association-Based Pharmacological Testing

Chris Sainsbury, Feng Dong, Andreas Karwath

arXiv:2605.04071 | 14.91 citations | h-index: 3

AI Analysis

For clinical researchers, the paper provides a nuanced evaluation of autoregressive models for generating patient-conditioned clinical trajectories, highlighting two limitations: the model does not distinguish causal from confounded associations, and preference optimisation is vulnerable to reward exploitation.

FlatASCEND, a 14.5M-parameter autoregressive model for clinical sequence generation, shows that patient-specific conditioning amplifies known pharmacological associations (2.0-2.2x) while leaving confounding-driven associations unchanged (0.9x). However, it recovers the correct mechanistic direction in only 4 of 10 comparisons under residual confounding, and direct preference optimisation with a surrogate reward destroys all correct associations (3/3 to 0/3).

Autoregressive models can predict clinical events, but generating patient-conditioned multi-step trajectories that respond to intervention tokens and testing whether those responses preserve known pharmacological associations has received limited attention. We present FlatASCEND, a 14.5M-parameter autoregressive clinical sequence model using flat composite tokens and a zero-inflated log-normal time head. Standard distributional metrics (Jaccard 0.889-0.954) do not distinguish FlatASCEND from trivial baselines; the model's value lies in conditional generation from patient-specific prefixes. A prompt-shuffle ablation shows patient-specific conditioning amplifies mechanistic pharmacological effects (2.0-2.2x for steroid to glucose, diuretic to potassium) while leaving confounding-driven associations unchanged (0.9x for insulin to glucose). An incident-user framework assesses directional consistency against prior pharmacological knowledge on MIMIC-IV (N=500 per comparison): 4/10 recover correct mechanistic directions, 2 reproduce treatment-context associations, 4 are incorrect (9/10 significant, Wilcoxon p<0.05). This pattern - partial recovery under residual confounding - is consistent with learned observational associations without causal distinction. Direct preference optimisation with surrogate reward destroys all correct associations (3/3 to 0/3), illustrating reward exploitation when reward and evaluation share an outcome domain. Generative evidence is strongest for short-horizon ICU data; outpatient temporal fidelity is weaker (median 10 vs 154 days on INSPECT), and zero-shot cross-site transfer degrades without adaptation.
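The abstract names a zero-inflated log-normal time head but does not give its parameterisation. As a rough illustration only, the sketch below shows one common way such a likelihood can be written: a point mass at zero (events sharing a timestamp) mixed with a log-normal density for positive gaps. The function names and the parameters pi_zero, mu, and sigma are assumptions made for this example and are not taken from the paper.

```python
import math


def zero_inflated_lognormal_nll(dt, pi_zero, mu, sigma):
    """Negative log-likelihood of a single time gap dt >= 0.

    Hypothetical parameterisation (an assumption, not the paper's):
      pi_zero : probability that the gap is exactly zero
      mu      : mean of log(dt) for positive gaps
      sigma   : standard deviation of log(dt) for positive gaps
    """
    if not 0.0 < pi_zero < 1.0:
        raise ValueError("pi_zero must lie strictly between 0 and 1")
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    if dt < 0.0:
        raise ValueError("dt must be non-negative")

    # Zero gap: only the point mass contributes.
    if dt == 0.0:
        return -math.log(pi_zero)

    # Positive gap: (1 - pi_zero) times the log-normal density.
    log_dt = math.log(dt)
    z = (log_dt - mu) / sigma
    log_density = (
        -log_dt
        - math.log(sigma)
        - 0.5 * math.log(2.0 * math.pi)
        - 0.5 * z * z
    )
    return -(math.log1p(-pi_zero) + log_density)


def expected_gap(pi_zero, mu, sigma):
    """Mean gap under the mixture: (1 - pi_zero) * exp(mu + sigma^2 / 2)."""
    return (1.0 - pi_zero) * math.exp(mu + 0.5 * sigma * sigma)


if __name__ == "__main__":
    # Illustrative values only; units are arbitrary.
    print(zero_inflated_lognormal_nll(0.0, pi_zero=0.25, mu=0.0, sigma=1.0))
    print(zero_inflated_lognormal_nll(1.0, pi_zero=0.5, mu=0.0, sigma=1.0))
    print(expected_gap(pi_zero=0.5, mu=0.0, sigma=1.0))
```

In a trained model the three parameters would presumably be produced per step by the network, with this loss summed over the sequence; that wiring is not described in the abstract and is left out here.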
