BoXHED: Boosted eXact Hazard Estimator with Dynamic covariates
This provides a tool for dynamic health risk scoring from high-frequency medical data, addressing a need in clinical research, though it is incremental as it implements an existing estimator.
The paper introduces BoXHED, a software package for nonparametric hazard estimation using gradient boosting to predict disease onset from time-varying health vitals, and applying it to cardiovascular data from the Framingham Heart Study reveals novel interaction effects among risk factors.
The proliferation of medical monitoring devices makes it possible to track health vitals at high frequency, enabling the development of dynamic health risk scores that change with the underlying readings. Survival analysis, in particular hazard estimation, is well-suited to analyzing this stream of data to predict disease onset as a function of the time-varying vitals. This paper introduces the software package BoXHED (pronounced 'box-head') for nonparametrically estimating hazard functions via gradient boosting. BoXHED 1.0 is a novel tree-based implementation of the generic estimator proposed in Lee, Chen, Ishwaran (2017), which was designed for handling time-dependent covariates in a fully nonparametric manner. BoXHED is also the first publicly available software implementation for Lee, Chen, Ishwaran (2017). Applying BoXHED to cardiovascular disease onset data from the Framingham Heart Study reveals novel interaction effects among known risk factors, potentially resolving an open question in clinical literature.