LG ML · Apr 23, 2024

Interpretable Prediction and Feature Selection for Survival Analysis

arXiv:2404.14689v16.43 citations · h-index: 2 · KDD

Originality: Incremental advance

AI Analysis

The paper addresses the need for accurate and interpretable survival models in healthcare, particularly for large datasets. The advance is incremental because it builds on existing Generalized Additive Models.

The paper tackles the problem of building interpretable survival analysis models for healthcare. It introduces DyS, a feature-sparse Generalized Additive Model that combines feature selection and interpretable prediction in a single model. DyS achieves performance competitive with state-of-the-art models while remaining highly interpretable.

Survival analysis is a widely used technique for modeling time-to-event data when some observations are censored, particularly in healthcare for predicting future patient risk. In such settings, survival models must be both accurate and interpretable so that users (such as doctors) can trust the model and understand its predictions. While most of the literature focuses on discrimination, interpretability is equally important. A successful interpretable model should be able to describe how changing each feature impacts the outcome, and should use only a small number of features. In this paper, we present DyS (pronounced "dice"), a new survival analysis model that achieves both strong discrimination and interpretability. DyS is a feature-sparse Generalized Additive Model, combining feature selection and interpretable prediction into one model. While DyS works well for all survival analysis problems, it is particularly useful for large (in $n$ and $p$) survival datasets such as those commonly found in observational healthcare studies. Empirical studies show that DyS competes with other state-of-the-art machine learning models for survival analysis, while being highly interpretable.
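
To make the two properties named in the abstract concrete, here is a minimal sketch of a feature-sparse additive risk score and of Harrell's concordance index, a common measure of discrimination under censoring. This is not the DyS implementation: the shape functions, the selected feature set, the feature meanings, and the toy data are all hypothetical and chosen only for illustration.

```python
def additive_risk(rows, shape_fns, active):
    """Risk score of an additive model restricted to the selected features.

    Each selected feature j contributes shape_fns[j](x_j) on its own, so the
    effect of changing one feature can be read off its shape function.
    """
    return [sum(shape_fns[j](row[j]) for j in active) for row in rows]


def concordance_index(time, event, risk):
    """Harrell's C: share of comparable pairs that the risk score orders correctly.

    A pair (i, j) is comparable when subject i has an observed event
    (event[i] == 1) and time[i] < time[j]; ties in risk count as half.
    """
    concordant, comparable = 0.0, 0
    for i in range(len(time)):
        if not event[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(len(time)):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")


# Hypothetical toy data: columns are (age, lab value, binary flag).
rows = [[70, 1.2, 0], [50, 0.8, 1], [60, 1.0, 0], [40, 0.9, 1]]
shape_fns = {
    0: lambda x: 0.05 * (x - 50),  # assumed shape function for age
    1: lambda x: x ** 2,           # assumed shape function, not selected below
    2: lambda x: 0.0 * x,          # assumed shape function, not selected below
}
active = [0]          # feature sparsity: only feature 0 is used
time = [2, 8, 5, 10]  # follow-up times
event = [1, 0, 1, 1]  # 1 = event observed, 0 = censored

risk = additive_risk(rows, shape_fns, active)
c_index = concordance_index(time, event, risk)
```

In this sketch sparsity is imposed by hand through `active`; a model such as the one described in the abstract would instead learn which features to keep as part of fitting.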
