End-Cut Preference in Survival Trees
This work addresses a specific bias in survival tree models. The contribution is incremental but important for improving the stability and interpretability of tree-based survival analysis.
The paper tackles the end-cut preference problem in survival trees, which causes imbalanced splits and instability. It proposes a smooth sigmoid surrogate approach and provides theoretical and numerical evidence that this approach mitigates the problem.
The end-cut preference (ECP) problem, referring to the tendency to favor split points near the boundaries of a feature's range, is a well-known issue in CART (Breiman et al., 1984). ECP may induce highly imbalanced and biased splits, obscure weak signals, and lead to tree structures that are both unstable and difficult to interpret. For survival trees, we show that ECP also arises when using greedy search to select the optimal cutoff point by maximizing the log-rank test statistic. To address this issue, we propose a smooth sigmoid surrogate (SSS) approach, in which the hard-threshold indicator function is replaced by a smooth sigmoid function. We further demonstrate, both theoretically and through numerical illustrations, that SSS provides an effective remedy for mitigating or avoiding ECP.
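As a rough illustration of the idea in the abstract, the sketch below compares a hard-threshold split with a sigmoid-smoothed one when computing a two-sample log-rank statistic. It is only a minimal sketch under stated assumptions, not the paper's implementation: the function names, the scale parameter `a`, the direction of the sigmoid, and the use of fractional group weights inside the standard log-rank formula are all illustrative choices that the abstract does not specify.

```python
import numpy as np

def sigmoid(z):
    # Numerically stable logistic function: 1 / (1 + exp(-z)).
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def logrank_stat(time, event, w):
    """Log-rank chi-square statistic with group-1 membership weights w.

    w may be 0/1 (hard split) or fractional in [0, 1] (smoothed split).
    Treating fractional weights this way is an assumption of this sketch.
    """
    U, V = 0.0, 0.0
    for t in np.unique(time[event == 1]):
        at_risk = time >= t                   # subjects still under observation at t
        died = (time == t) & (event == 1)     # events occurring exactly at t
        Y, d = at_risk.sum(), died.sum()
        Y1, d1 = w[at_risk].sum(), w[died].sum()
        U += d1 - Y1 * d / Y                  # observed minus expected in group 1
        if Y > 1:
            V += Y1 * (Y - Y1) * d * (Y - d) / (Y**2 * (Y - 1))
    return U**2 / V if V > 0 else 0.0

def hard_split_stat(x, time, event, c):
    # Hard-threshold indicator I(x <= c): the greedy-search criterion.
    return logrank_stat(time, event, (x <= c).astype(float))

def smooth_split_stat(x, time, event, c, a=10.0):
    # Indicator replaced by a sigmoid in c; a (hypothetical name) controls sharpness.
    return logrank_stat(time, event, sigmoid(a * (c - x)))
```

Because `smooth_split_stat` is a smooth function of the cutoff `c`, it can be maximized with a one-dimensional continuous optimizer rather than by scanning every observed value of `x`. As `a` grows, the smoothed statistic approaches the hard-threshold one.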