ML, LG · May 7

When Does Trimming Help Conformal Prediction? A Retained-Law Diagnostic under Calibration Contamination

arXiv:2605.06204

AI Analysis

For practitioners using conformal prediction with potentially contaminated calibration data, this work clarifies the conditions under which trimming improves coverage guarantees.

The paper analyzes when trimming suspicious calibration points helps conformal prediction under contamination. It shows that clean-target coverage after trimming is governed by the retained law that trimming induces, not by the contamination level alone, which reduces the coverage question to a one-dimensional score-CDF transfer problem. It provides an exact finite-sample identity and a population-level diagnostic for deciding when trimming is beneficial.

Trimming suspicious calibration points is a common response to contamination in conformal prediction. Its effect on clean-target coverage, however, is governed by the retained law induced by trimming, not by the contamination level alone. We analyse fixed-threshold trimming as conditioning rather than purification. It replaces the contaminated calibration law with a retained law, reducing clean-target coverage to a one-dimensional score-CDF transfer problem with an exact finite-sample identity. A componentwise bound on the transfer gap gives a population-level diagnostic. This separates a clean-side covariance cost from a retained-contamination cost, governed by the dirty-to-clean retention ratio. Trimming helps when the anomaly score separates retention probabilities while remaining score-neutral on the clean population. Otherwise, it cannot substantially reduce contamination through the retained mixture coefficient. We also give finite-sample certificate templates that provide numerical guarantees under independent audit.
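
The role of the dirty-to-clean retention ratio can be illustrated with a short sketch. It is not the paper's code or notation: the function names, the `anomaly <= tau` retention rule, and the Bayes-rule expression for the contaminated fraction of the retained sample are assumptions chosen to match the abstract's description of fixed-threshold trimming as conditioning. The quantile is the standard split-conformal rank, ceil((n + 1)(1 - alpha)).

```python
import math


def retained_contamination(eps, r_clean, r_dirty):
    """Contaminated fraction among retained points, by Bayes' rule.

    eps     : contamination level before trimming (assumed known here)
    r_clean : probability that a clean point is retained
    r_dirty : probability that a contaminated point is retained
    Dividing through by r_clean shows the result depends on the
    retention probabilities only through the ratio r_dirty / r_clean.
    """
    kept_clean = (1.0 - eps) * r_clean
    kept_dirty = eps * r_dirty
    total = kept_clean + kept_dirty
    if total == 0.0:
        raise ValueError("no calibration mass is retained")
    return kept_dirty / total


def conformal_quantile(scores, alpha):
    """Split-conformal threshold: the ceil((n+1)(1-alpha))-th smallest score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        return math.inf  # too few points for a finite threshold
    return sorted(scores)[k - 1]


def trimmed_quantile(scores, anomaly, tau, alpha):
    """Keep points whose anomaly score is at most tau, then calibrate.

    Hypothetical retention rule: trimming conditions the calibration
    sample on {anomaly <= tau}; it does not label points clean or dirty.
    """
    kept = [s for s, a in zip(scores, anomaly) if a <= tau]
    return conformal_quantile(kept, alpha)


if __name__ == "__main__":
    # Equal retention probabilities: trimming discards data but leaves
    # the contaminated fraction unchanged.
    print(retained_contamination(0.2, 0.9, 0.9))
    # Well-separated retention probabilities: the fraction drops sharply.
    print(retained_contamination(0.2, 0.95, 0.1))
```

When the two retention probabilities are equal, the retained contaminated fraction equals the original one, which matches the abstract's point that trimming cannot reduce contamination unless the anomaly score separates retention probabilities. The sketch does not model the clean-side covariance cost, which arises when retention is correlated with the conformity score on the clean population.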
