Learning and Naming Subgroups with Exceptional Survival Characteristics
The work addresses the need for more flexible and interpretable subgroup discovery in fields such as medicine and predictive maintenance, though the contribution appears incremental because it builds on random survival forests.
The paper tackles the problem of identifying subpopulations with exceptional survival characteristics, such as unusually long or short survival times. It proposes Sysurv, a fully differentiable, non-parametric method that learns individual survival curves and interpretable rules. Empirical evaluation shows that it reveals insightful and actionable subgroups in various datasets, including cancer data.
In many applications, it is important to identify subpopulations that survive longer or shorter than the rest of the population. In medicine, for example, this allows determining which patients benefit from treatment, and in predictive maintenance, which components are more likely to fail. Existing methods for discovering subgroups with exceptional survival characteristics require restrictive assumptions about the survival model (e.g., proportional hazards) and pre-discretized features, and, because they compare average statistics, tend to overlook individual deviations. In this paper, we propose Sysurv, a fully differentiable, non-parametric method that leverages random survival forests to learn individual survival curves and automatically learns conditions, as well as how to combine them into inherently interpretable rules, so as to select subgroups with exceptional survival characteristics. Empirical evaluation on a wide range of datasets and settings, including a case study on cancer data, shows that Sysurv reveals insightful and actionable survival subgroups.
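To make the idea of a rule-defined subgroup with exceptional survival more concrete, the following is a minimal sketch and not the Sysurv implementation. It assumes two simplifications that are not taken from the paper: conditions are modeled as sigmoid-relaxed intervals multiplied together as a soft AND (one common way to keep a rule differentiable in its bounds), and a weighted Kaplan-Meier estimator stands in for the random survival forest. The names `soft_rule`, `kaplan_meier`, the `temperature` parameter, and the toy data are all hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_rule(row, bounds, temperature=0.05):
    """Soft membership in [0, 1] of one sample in a conjunction of intervals.

    bounds maps a feature index to (lower, upper). Each interval is relaxed
    with two sigmoids, and the product over features acts as a soft AND.
    """
    membership = 1.0
    for j, (lower, upper) in bounds.items():
        above = sigmoid((row[j] - lower) / temperature)
        below = sigmoid((upper - row[j]) / temperature)
        membership *= above * below
    return membership


def kaplan_meier(times, events, weights=None):
    """Weighted Kaplan-Meier estimate.

    Returns the sorted distinct event times and S(t) at each of them.
    events[i] is truthy if the event was observed, falsy if censored.
    """
    if weights is None:
        weights = [1.0] * len(times)
    event_times = sorted({t for t, e in zip(times, events) if e})
    survival, s = [], 1.0
    for t in event_times:
        at_risk = sum(w for ti, w in zip(times, weights) if ti >= t)
        deaths = sum(w for ti, e, w in zip(times, events, weights)
                     if ti == t and e)
        if at_risk > 0:
            s *= 1.0 - deaths / at_risk
        survival.append(s)
    return event_times, survival


# Toy data (hypothetical): one feature, larger values tend to survive longer.
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 1, 1, 1, 0, 1, 0, 1]
X = [[0.1], [0.2], [0.3], [0.4], [0.9], [0.8], [0.7], [0.95]]

rule = {0: (0.6, 1.5)}  # "feature 0 lies between 0.6 and 1.5"
member = [soft_rule(row, rule) for row in X]

# Survival inside the soft subgroup versus its complement.
t_in, s_in = kaplan_meier(times, events, member)
t_out, s_out = kaplan_meier(times, events, [1.0 - m for m in member])

for t, a, b in zip(t_in, s_in, s_out):
    print(f"t={t}: S_subgroup={a:.3f}  S_rest={b:.3f}")
```

Because the membership weights vary smoothly with the interval bounds, a gap between the two curves could in principle serve as a training signal for the bounds. How Sysurv actually defines and optimizes its objective is not specified in the abstract, so that part is left out of this sketch.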