Conformal Predictions under Markovian Data
This work addresses reliable uncertainty quantification for sequential and time-series data, a concern for practitioners in fields such as finance and healthcare. It is an incremental contribution that adapts existing conformal methods to non-exchangeable settings.
The paper analyzes split Conformal Prediction applied to Markovian data. It quantifies the coverage gap caused by correlations in the data and shows that this gap typically scales as $\sqrt{t_\mathrm{mix}\ln(n)/n}$, where $t_\mathrm{mix}$ is the mixing time of the chain. It then introduces $K$-split CP, which reduces the gap to $t_\mathrm{mix}/(n\ln(n))$ without significantly affecting the size of the prediction set, and tests the methods on synthetic and real-world datasets.
We study the split Conformal Prediction method when applied to Markovian data. We quantify the coverage gap induced by the correlations in the data (compared to exchangeable data). This gap strongly depends on the mixing properties of the underlying Markov chain, and we prove that it typically scales as $\sqrt{t_\mathrm{mix}\ln(n)/n}$ (where $t_\mathrm{mix}$ is the mixing time of the chain). We also derive upper bounds on the impact of the correlations on the size of the prediction set. We then present $K$-split CP, a method that thins the calibration dataset and adapts to the mixing properties of the chain. Its coverage gap is reduced to $t_\mathrm{mix}/(n\ln(n))$ without substantially affecting the size of the prediction set. Finally, we test our algorithms on synthetic and real-world datasets.
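To make the idea of thinning the calibration set concrete, the following is a minimal sketch and not the paper's algorithm. It assumes that thinning means keeping every $K$-th calibration score, that the nonconformity score is the absolute residual of a trivial zero predictor, and that the data come from a toy AR(1) chain. The names `conformal_quantile`, `thin`, and `simulate_ar1` are hypothetical helpers introduced only for illustration.

```python
import numpy as np


def conformal_quantile(scores, alpha):
    """Standard split-CP threshold: the ceil((m + 1)(1 - alpha))-th smallest score."""
    scores = np.sort(np.asarray(scores, dtype=float))
    m = scores.size
    rank = int(np.ceil((m + 1) * (1 - alpha)))
    # With too few calibration points the threshold is infinite (trivial set).
    return np.inf if rank > m else scores[rank - 1]


def thin(scores, K):
    """Assumed thinning rule: keep every K-th calibration score."""
    return np.asarray(scores)[::K]


def simulate_ar1(n, rho, rng):
    """Toy stationary AR(1) chain; larger rho means slower mixing."""
    x = np.empty(n)
    x[0] = rng.standard_normal()
    noise = rng.standard_normal(n) * np.sqrt(1 - rho**2)
    for t in range(1, n):
        x[t] = rho * x[t - 1] + noise[t]
    return x


rng = np.random.default_rng(0)
chain = simulate_ar1(2000, rho=0.9, rng=rng)

# Toy nonconformity scores: absolute residuals of a predictor that always outputs 0.
scores = np.abs(chain)
alpha = 0.1

q_full = conformal_quantile(scores, alpha)            # all correlated scores
q_thin = conformal_quantile(thin(scores, 20), alpha)  # every 20th score only

print(f"threshold, full calibration set:    {q_full:.3f}")
print(f"threshold, thinned calibration set: {q_thin:.3f}")
```

The thinned scores are further apart in time and therefore less correlated, at the cost of a smaller calibration set. The value `K = 20` here is arbitrary; how $K$ should be chosen relative to $t_\mathrm{mix}$ is the subject of the paper's analysis and is not reproduced in this sketch.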