ML · LG · Apr 22

Online Survival Analysis: A Bandit Approach under Cox PH Model

arXiv:2604.20296 · 30.5 · h-index: 2
Predicted impact: top 55% in ML · last 90 days · Originality: Incremental advance
AI Analysis

This work addresses a gap in applying survival analysis to online decision-making, such as optimizing treatments in healthcare. It is an initial step, and its adaptation of existing bandit methods may be incremental.

The paper integrates survival analysis into online learning under the Cox PH model, addressing challenges such as delayed feedback and censoring. The adapted bandit algorithms come with theoretical guarantees of sublinear regret, and simulations and semi-real experiments on SEER cancer data show that they learn near-optimal treatment policies effectively.

Survival analysis is a widely used statistical framework for modeling time-to-event data under censoring. Classical methods, such as the Cox proportional hazards (Cox PH) model, offer a semiparametric approach to estimating the effects of covariates on the hazard function. Despite its importance, survival analysis has been largely unexplored in online settings, particularly within the bandit framework, where decisions must be made sequentially to optimize treatments as new data arrive over time. In this work, we take an initial step toward integrating survival analysis into a purely online learning setting under the Cox PH model, addressing key challenges including staggered entry, delayed feedback, and right censoring. We adapt three canonical bandit algorithms to balance exploration and exploitation, with theoretical guarantees of sublinear regret bounds. Extensive simulations and semi-real experiments using SEER cancer data demonstrate that our approach enables rapid and effective learning of near-optimal treatment policies.
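
As a rough illustration of the Cox PH model mentioned in the abstract, the sketch below evaluates the standard Cox partial log-likelihood under right censoring, where the hazard is assumed to take the form h(t | x) = h0(t) exp(beta · x). It is not taken from the paper: the function name `cox_partial_loglik`, its argument layout, and the toy data are hypothetical, ties are handled Breslow-style, and the online aspects the paper studies (staggered entry, delayed feedback, the bandit algorithms themselves) are not modeled here.

```python
import math


def cox_partial_loglik(beta, X, time, event):
    """Cox partial log-likelihood with right censoring (Breslow handling of ties).

    beta  : list of coefficients, one per covariate
    X     : list of covariate rows, one row per subject
    time  : observed time per subject (event time or censoring time)
    event : 1 if the event was observed, 0 if the subject was right-censored
    """
    # Linear predictor eta_i = beta . x_i for every subject.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    total = 0.0
    for i, observed in enumerate(event):
        if not observed:
            # Censored subjects contribute only through other subjects' risk sets.
            continue
        # Risk set: everyone still under observation at subject i's event time.
        risk = [math.exp(eta[j]) for j in range(len(time)) if time[j] >= time[i]]
        total += eta[i] - math.log(sum(risk))
    return total


# Toy data (made up for illustration): one covariate, e.g. a treatment indicator.
X = [[0.0], [1.0], [1.0], [0.0]]
time = [2.0, 5.0, 3.0, 4.0]
event = [1, 0, 1, 1]
print(cox_partial_loglik([0.5], X, time, event))
```

The baseline hazard h0(t) cancels out of each ratio, which is why the model can be fitted semiparametrically; maximizing this quantity over `beta` with any numerical optimizer gives the usual Cox estimate.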

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
