Bayesian Nonparametric Modeling of Heterogeneous Groups of Censored Data
This work addresses the analysis of censored data from groups that are too heterogeneous to pool and too small to analyze individually. Its contribution is incremental, because it compares existing methods.
The paper tackles the problem of modeling time-to-event data from small, heterogeneous groups by comparing three Bayesian nonparametric methods (the Dirichlet process, the hierarchical Dirichlet process, and the nested Dirichlet process) on simulated and real-world datasets, assessing how accurately each models survival curves that differ across groups.
Datasets containing large samples of time-to-event data arising from several small heterogeneous groups are commonly encountered in statistics. Such datasets are difficult to analyze: the groups cannot be pooled directly because of their heterogeneity, nor can they be analyzed individually because of their small sample sizes. Bayesian nonparametric modelling approaches are well suited to such datasets, given their ability to share information flexibly across groups. In this paper, we compare three popular Bayesian nonparametric methods for modelling the survival functions of heterogeneous groups. Specifically, we first compare the modelling accuracy of the Dirichlet process, the hierarchical Dirichlet process, and the nested Dirichlet process on simulated datasets of different sizes, where group survival curves differ in shape or in expectation. We then compare the models on a real-world injury dataset.
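
To make the simplest of the three priors concrete, the following is a minimal sketch of a single draw from a Dirichlet process prior over event times, using the standard truncated stick-breaking construction, together with the survival function that draw implies. It is an illustration only, not the models or inference procedure used in the paper: the exponential base measure, the truncation level, and the names `stick_breaking_weights`, `draw_survival_function`, `alpha`, `num_atoms`, and `base_rate` are all assumptions made for this example. It performs no posterior inference and does not handle censoring.

```python
import random


def stick_breaking_weights(alpha, num_atoms, rng):
    """Truncated stick-breaking weights for a DP with concentration alpha."""
    weights = []
    remaining = 1.0  # length of the stick not yet broken off
    for _ in range(num_atoms - 1):
        frac = rng.betavariate(1.0, alpha)  # Beta(1, alpha) fraction of the remainder
        weights.append(remaining * frac)
        remaining *= 1.0 - frac
    weights.append(remaining)  # truncation: leftover mass goes to the last atom
    return weights


def draw_survival_function(alpha, num_atoms, base_rate, rng):
    """Draw a random discrete distribution over event times and return S(t)."""
    weights = stick_breaking_weights(alpha, num_atoms, rng)
    # Assumed base measure: exponential event times with rate base_rate.
    atoms = [rng.expovariate(base_rate) for _ in range(num_atoms)]

    def survival(t):
        # S(t) = P(T > t) under the drawn discrete distribution.
        return sum(w for w, a in zip(weights, atoms) if a > t)

    return survival


rng = random.Random(0)
S = draw_survival_function(alpha=5.0, num_atoms=200, base_rate=0.1, rng=rng)
grid = [0.0, 5.0, 10.0, 20.0, 40.0]
values = [S(t) for t in grid]
```

Each call to `draw_survival_function` yields a different step-shaped survival curve. Roughly speaking, the hierarchical and nested variants differ in how such draws are tied together across groups, which is the information sharing the comparison is concerned with.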