Positive-Unlabelled Survival Data Analysis
This work addresses the problem of biased results in traditional survival analysis when applied to a novel positive-unlabeled data setup, which is significant for researchers and practitioners dealing with such data.
This paper introduces a new framework for positive-unlabeled (PU) data in survival analysis, where the positive data consist of observed event times and the unlabeled data consist of censoring times with unknown event status. The authors develop parametric, nonparametric, and machine learning models for this framework and show through simulations that the proposed methods can yield valid results, whereas traditional survival analysis may produce severely biased ones.
In this paper, we consider a novel framework of positive-unlabeled data in survival analysis. As positive data, survival times are observed for subjects who experience the event during the observation period. As unlabeled data, censoring times are observed for other subjects, but whether the event occurred is unknown. We consider two cases: (1) the censoring time is also observed in the positive data, and (2) it is not. For both cases, we develop parametric, nonparametric, and machine learning models, together with estimation strategies for these models. Simulation studies show that, under this data setup, traditional survival analysis may yield severely biased results, while the proposed estimation methods can provide valid results.
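The bias described above can be illustrated with a small simulation. The sketch below is not the proposed estimation method. It only generates data resembling case (2) under assumptions of our own (exponential event and censoring times, and a hypothetical `label_prob` giving the chance that an event occurring before censoring is actually recorded) and then applies the standard censored-data estimate of an exponential rate, treating every unlabeled record as right-censored. All function and parameter names are illustrative.

```python
import random


def simulate_pu_survival(n, event_rate, censor_rate, label_prob, seed=0):
    """Simulate one hypothetical positive-unlabeled survival sample.

    Returns a list of (time, labeled) pairs. labeled=True means the event
    time was observed (positive data). labeled=False means only the
    censoring time was observed and the event status is unknown.
    """
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        t = rng.expovariate(event_rate)   # latent event time
        c = rng.expovariate(censor_rate)  # censoring time
        # Assumption: an event before censoring is recorded only with
        # probability label_prob; otherwise the subject is unlabeled.
        if t <= c and rng.random() < label_prob:
            records.append((t, True))
        else:
            records.append((c, False))
    return records


def naive_exponential_rate(records):
    """Censored-data MLE of an exponential rate: events / total time.

    Every unlabeled record is treated as right-censored, which is what a
    traditional analysis would do with this data.
    """
    n_events = sum(1 for _, labeled in records if labeled)
    total_time = sum(time for time, _ in records)
    return n_events / total_time


records = simulate_pu_survival(
    n=20000, event_rate=1.0, censor_rate=0.5, label_prob=0.6
)
naive_rate = naive_exponential_rate(records)
print(f"true rate = 1.0, naive estimate = {naive_rate:.3f}")
```

Under these assumed settings the naive estimate falls well below the true rate of 1.0, because unlabeled subjects whose events went unrecorded are counted as still event-free at their censoring times. Setting `label_prob=1.0` recovers ordinary right-censored data, for which the same estimator is appropriate.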