Conformal Margin Risk Minimization: An Envelope Framework for Robust Learning under Label Noise
This work addresses robust learning under label noise, offering machine learning practitioners a method-agnostic approach that requires no privileged knowledge and no changes to the existing training pipeline.
The paper tackles learning with noisy labels without privileged knowledge. It proposes Conformal Margin Risk Minimization (CMRM), a plug-and-play framework that improves classification accuracy by up to 3.39% and reduces conformal prediction set size by up to 20.44% across six benchmarks.
Most methods for learning with noisy labels require privileged knowledge such as noise transition matrices, clean subsets or pretrained feature extractors, resources typically unavailable when robustness is most needed. We propose Conformal Margin Risk Minimization (CMRM), a plug-and-play envelope framework that improves any classification loss under label noise by adding a single quantile-calibrated regularization term, with no privileged knowledge or training pipeline modification. CMRM measures the confidence margin between the observed label and competing labels, and thresholds it with a conformal quantile estimated per batch to focus training on high-margin samples while suppressing likely mislabeled ones. We derive a learning bound for CMRM under arbitrary label noise requiring only mild regularity of the margin distribution. Across five base methods and six benchmarks with synthetic and real-world noise, CMRM consistently improves accuracy (up to +3.39%), reduces conformal prediction set size (up to -20.44%) and does not hurt under 0% noise, showing that CMRM captures a method-agnostic uncertainty signal that existing mechanisms did not exploit.
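The mechanism described above (a confidence margin per sample, a quantile estimated per batch, and one added regularization term) can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the margin is taken as the probability of the observed label minus the largest competing probability, the quantile uses the common finite-sample conformal rank ceil((n + 1) * alpha), and the regularizer is the mean base loss over samples whose margin reaches the threshold. The names `confidence_margin`, `batch_quantile`, `cmrm_style_loss`, `alpha` and `lam` are hypothetical and do not come from the paper.

```python
import math


def confidence_margin(probs, label):
    """Probability of the observed label minus the best competing label.

    Assumed definition; a negative value means another class is preferred.
    """
    rival = max(p for k, p in enumerate(probs) if k != label)
    return probs[label] - rival


def batch_quantile(margins, alpha):
    """Empirical quantile of the batch margins at level alpha.

    Uses the finite-sample conformal rank ceil((n + 1) * alpha),
    clamped to [1, n] (an assumption, not the paper's stated estimator).
    """
    n = len(margins)
    k = max(1, min(n, math.ceil((n + 1) * alpha)))
    return sorted(margins)[k - 1]


def cmrm_style_loss(batch_probs, labels, alpha=0.25, lam=1.0):
    """Base cross-entropy plus one quantile-thresholded term (illustrative)."""
    margins = [confidence_margin(p, y) for p, y in zip(batch_probs, labels)]
    q = batch_quantile(margins, alpha)
    # Samples below the batch threshold are treated as likely mislabeled.
    keep = [m >= q for m in margins]
    base = [-math.log(p[y]) for p, y in zip(batch_probs, labels)]
    base_mean = sum(base) / len(base)
    kept = [loss for loss, k in zip(base, keep) if k]
    regularizer = sum(kept) / len(kept)  # never empty: the max margin is >= q
    return base_mean + lam * regularizer, q, keep


if __name__ == "__main__":
    # Toy batch: the third sample's observed label (1) disagrees with the model.
    batch_probs = [
        [0.70, 0.20, 0.10],
        [0.10, 0.80, 0.10],
        [0.60, 0.30, 0.10],
        [0.40, 0.35, 0.25],
    ]
    labels = [0, 1, 1, 0]
    total, q, keep = cmrm_style_loss(batch_probs, labels)
    print("threshold:", q, "kept:", keep, "loss:", total)
```

In this toy batch the third sample has a negative margin, falls below the batch quantile, and is excluded from the added term, while the base loss still sees every sample. In a real training loop the same computation would run on model outputs with an autodiff framework; the pure-Python form is only meant to make the data flow visible.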