Improving the statistical efficiency of cross-conformal prediction
This work addresses statistical efficiency in cross-conformal prediction, offering incremental improvements for machine learning practitioners who need reliable uncertainty quantification.
The authors tackled the problem of wide prediction sets in cross-conformal prediction by proposing new variants that reduce set sizes while maintaining theoretical coverage guarantees, with simulations confirming these improvements.
Vovk (2015) introduced cross-conformal prediction, a modification of split conformal prediction designed to yield narrower prediction sets. When run at a miscoverage rate $α$ with $n \gg K$, the method ensures marginal coverage of at least $1 - 2α - 2(1-α)(K-1)/(n+K)$, where $n$ is the number of observations and $K$ denotes the number of folds. A simple modification of the method achieves coverage of at least $1 - 2α$. In this work, we propose new variants of both methods that yield smaller prediction sets without compromising these theoretical guarantees. The proposed methods are based on recent results deriving more statistically efficient combinations of p-values that leverage exchangeability and randomization. Simulations confirm the theoretical findings and bring out some important tradeoffs.
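To make the quantities in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation and not the new variants proposed here. It evaluates the two coverage lower bounds quoted above and computes a cross-conformal p-value in the form usually attributed to Vovk (2015): one plus the number of held-out scores at least as large as the test score, pooled over folds, divided by $n + 1$. The function names, the fold-mean predictor, the absolute-residual score, and the interleaved fold assignment are all illustrative assumptions.

```python
def coverage_lower_bound(alpha, n, K):
    """Bound quoted in the abstract: 1 - 2a - 2(1 - a)(K - 1)/(n + K)."""
    return 1 - 2 * alpha - 2 * (1 - alpha) * (K - 1) / (n + K)


def modified_coverage_lower_bound(alpha):
    """Bound quoted for the simple modification: 1 - 2a."""
    return 1 - 2 * alpha


def cross_conformal_p_value(ys, y_candidate, K):
    """Cross-conformal p-value for a candidate label (illustrative sketch).

    Assumptions: there are no features, the 'model' fitted on the other
    K - 1 folds is just their mean, and the score is the absolute residual.
    Requires 2 <= K <= len(ys) so every training set is non-empty.
    """
    n = len(ys)
    # Interleaved fold assignment: fold k holds indices k, k + K, k + 2K, ...
    folds = [list(range(k, n, K)) for k in range(K)]
    count = 0
    for fold in folds:
        held_out = set(fold)
        train = [ys[i] for i in range(n) if i not in held_out]
        mu = sum(train) / len(train)  # hypothetical stand-in for a fitted model
        test_score = abs(y_candidate - mu)
        # Count held-out points whose score is at least the test score.
        count += sum(1 for i in fold if abs(ys[i] - mu) >= test_score)
    return (1 + count) / (n + 1)


if __name__ == "__main__":
    print(coverage_lower_bound(0.1, 1000, 10))
    print(modified_coverage_lower_bound(0.1))
    data = [float(v) for v in range(10)]
    print(cross_conformal_p_value(data, 4.5, K=5))
    print(cross_conformal_p_value(data, 100.0, K=5))
```

With $α = 0.1$, $n = 1000$ and $K = 10$, the first bound is about 0.784, slightly below the $1 - 2α = 0.8$ guaranteed by the modified method. A candidate label enters the prediction set when its p-value exceeds $α$, so a candidate far from the data receives the smallest possible p-value, $1/(n+1)$, and is excluded whenever $α$ is at least that large.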