Improving Conditional Coverage via Orthogonal Quantile Regression
This work addresses the problem of producing prediction intervals that attain the target coverage in every region of feature space, a concern for practitioners in statistics and machine learning. It is an incremental improvement over traditional quantile regression.
The paper tackles the poor conditional coverage of prediction intervals produced by quantile regression. It modifies the loss function to promote independence between the interval size and the miscoverage indicator, and shows empirically that this improves conditional coverage.
We develop a method to generate prediction intervals that have a user-specified coverage level across all regions of feature space, a property called conditional coverage. A typical approach to this task is to estimate the conditional quantiles with quantile regression; it is well known that this leads to correct coverage in the large-sample limit, although it may not be accurate in finite samples. We find in experiments that traditional quantile regression can have poor conditional coverage. To remedy this, we modify the loss function to promote independence between the size of the intervals and the indicator of a miscoverage event. For the true conditional quantiles, these two quantities are independent (orthogonal), so the modified loss function continues to be valid. Moreover, we empirically show that the modified loss function leads to improved conditional coverage, as evaluated by several metrics. We also introduce two new metrics that check conditional coverage by looking at the strength of the dependence between the interval size and the indicator of miscoverage.
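To make the idea concrete, here is a minimal NumPy sketch of a quantile regression loss with an added dependence penalty. It rests on several assumptions that are not stated in the abstract: the dependence measure is taken to be the absolute Pearson correlation between interval length and the coverage indicator (the abstract does not say which measure is used), and the function names and the weight `gamma` are hypothetical. The sketch only evaluates the loss on fixed predictions. Because the coverage indicator is piecewise constant, gradient-based training would need a differentiable surrogate, which is not shown here.

```python
import numpy as np


def pinball_loss(y, q_pred, alpha):
    """Average pinball (quantile) loss of predictions q_pred at level alpha."""
    diff = y - q_pred
    return float(np.mean(np.maximum(alpha * diff, (alpha - 1.0) * diff)))


def coverage_length_correlation(y, lower, upper):
    """Pearson correlation between interval length and the coverage indicator.

    Under exact conditional coverage the indicator is independent of the
    features, and hence of the interval length, so this should be near zero.
    """
    length = upper - lower
    covered = ((y >= lower) & (y <= upper)).astype(float)
    # Correlation is undefined when either quantity is constant; report 0.
    if length.max() == length.min() or covered.max() == covered.min():
        return 0.0
    return float(np.corrcoef(length, covered)[0, 1])


def orthogonal_qr_loss(y, lower, upper, alpha_lo, alpha_hi, gamma):
    """Pinball loss on both interval endpoints plus a decorrelation penalty.

    gamma is a hypothetical weight trading off quantile accuracy against
    the dependence penalty (assumed form, not taken from the source).
    """
    base = pinball_loss(y, lower, alpha_lo) + pinball_loss(y, upper, alpha_hi)
    penalty = abs(coverage_length_correlation(y, lower, upper))
    return base + gamma * penalty


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(0.0, 1.0, size=5000)
    y = rng.normal(0.0, 0.5 + 2.0 * x)  # noise scale grows with x

    # Intervals that widen too slowly with x: they tend to miss where they are wide.
    half_width = 1.5 + 0.5 * x
    lower, upper = -half_width, half_width
    print("correlation:", coverage_length_correlation(y, lower, upper))
    print("penalized loss:", orthogonal_qr_loss(y, lower, upper, 0.05, 0.95, 1.0))
```

Setting `gamma` to zero recovers the plain pinball objective, so the penalty can be read as a regularizer layered on top of standard quantile regression. The same correlation function can also serve as a rough diagnostic on held-out data, in the spirit of the dependence-based metrics mentioned in the abstract.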