Composite Quantile Regression With XGBoost Using the Novel Arctan Pinball Loss
This work addresses a specific bottleneck in applying XGBoost to quantile regression and is aimed at practitioners who need reliable conditional quantile estimates. The contribution is incremental, since it builds on existing smooth approximations of the pinball loss.
The paper addresses composite quantile regression with XGBoost by introducing the arctan pinball loss, a smooth approximation of the pinball loss whose second derivative stays relatively large rather than vanishing. This allows multiple quantiles to be predicted simultaneously, which is more efficient and results in far fewer quantile crossings.
This paper explores the use of XGBoost for composite quantile regression. XGBoost is a highly popular model renowned for its flexibility, efficiency, and capability to deal with missing data. Its optimization uses a second-order approximation of the loss function, which complicates the use of loss functions with a zero or vanishing second derivative. Quantile regression, a popular approach to obtain conditional quantiles when point estimates alone are insufficient, unfortunately uses such a loss function: the pinball loss. Existing workarounds are typically inefficient and can result in severe quantile crossings. In this paper, we present a smooth approximation of the pinball loss, the arctan pinball loss, that is tailored to the needs of XGBoost. Specifically, contrary to other smooth approximations, the arctan pinball loss has a relatively large second derivative, which makes it more suitable for use in the second-order approximation. Using this loss function enables the simultaneous prediction of multiple quantiles, which is more efficient and results in far fewer quantile crossings.
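To make the role of the second derivative concrete, the following is a minimal sketch of the standard pinball loss next to one arctan-smoothed form. The text above does not state the formula, so the expression (tau - 0.5 + arctan(u/s)/pi) * u + s/pi, the smoothing parameter s, and all function names are assumptions chosen for illustration. The residual is u = y - q, where q is the predicted quantile at level tau.

```python
import math


def pinball_loss(u, tau):
    """Standard pinball loss for residual u = y - q at quantile level tau.

    It is piecewise linear, so its second derivative is zero wherever it exists.
    """
    return u * (tau - (1.0 if u < 0 else 0.0))


def arctan_pinball_loss(u, tau, s=0.1):
    """Assumed smooth form: (tau - 0.5 + arctan(u/s)/pi) * u + s/pi.

    s is a hypothetical smoothing parameter; smaller s tracks the pinball
    loss more closely.
    """
    return (tau - 0.5 + math.atan(u / s) / math.pi) * u + s / math.pi


def arctan_pinball_grad_hess(y, q, tau, s=0.1):
    """Gradient and Hessian of the assumed loss with respect to the prediction q."""
    x = (y - q) / s
    # Derivative with respect to the residual u.
    dloss_du = tau - 0.5 + math.atan(x) / math.pi + x / (math.pi * (1.0 + x * x))
    grad = -dloss_du  # chain rule: du/dq = -1
    # Second derivative: strictly positive, decaying polynomially in |u|.
    hess = 2.0 / (math.pi * s * (1.0 + x * x) ** 2)
    return grad, hess
```

Under these assumptions the Hessian is strictly positive for every residual, which is the property a second-order method needs. A custom XGBoost objective would return these two quantities for each sample and each quantile level.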