Binary Classification with Confidence Difference
This work addresses a weakly supervised learning setting in which collecting pointwise confidence labels is impractical, and offers a method with theoretical guarantees.
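The paper tackles binary classification when only pairwise confidence differences are available instead of pointwise labeling confidence. It proposes a risk-consistent approach whose estimation error bound achieves the optimal convergence rate, and a risk correction approach that mitigates overfitting and has proven consistency and convergence rate. Both are validated on benchmark data sets and a real-world recommender system data set.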
Recently, learning with soft labels has been shown to achieve better performance than learning with hard labels in terms of model generalization, calibration, and robustness. However, collecting pointwise labeling confidence for all training examples can be challenging and time-consuming in real-world scenarios. This paper studies a novel weakly supervised binary classification problem called confidence-difference (ConfDiff) classification. Instead of pointwise labeling confidence, we are given only unlabeled data pairs, each with a confidence difference that specifies the difference between the two examples' probabilities of being positive. We propose a risk-consistent approach to tackle this problem and show that the estimation error bound achieves the optimal convergence rate. We also introduce a risk correction approach to mitigate overfitting problems, whose consistency and convergence rate are also proven. Extensive experiments on benchmark data sets and a real-world recommender system data set validate the effectiveness of our proposed approaches in exploiting the supervision information of the confidence difference.
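To make the setting concrete, the following is a minimal sketch of how classification risk could be estimated from confidence differences alone. It is an illustration under stated assumptions, not necessarily the estimator used in the paper. It assumes the confidence difference of a pair (x, x') is c = p(y=+1 | x') - p(y=+1 | x) (the sign convention is an assumption), that the two members of each pair are drawn independently from the same marginal, and that the positive class prior is known. The posterior, the data distribution, and the names `posterior`, `make_confdiff_pairs`, `confdiff_risk`, and `supervised_risk` are all hypothetical and chosen for illustration only.

```python
import math
import random


def posterior(x):
    """Hypothetical ground-truth p(y=+1 | x): a logistic curve (assumption)."""
    return 1.0 / (1.0 + math.exp(-2.0 * x))


def logistic_loss(score, label):
    """Logistic loss for a real-valued score and a label in {+1, -1}."""
    return math.log1p(math.exp(-label * score))


def make_confdiff_pairs(n, rng):
    """Draw n independent unlabeled pairs and attach a confidence difference."""
    pairs = []
    for _ in range(n):
        x, x2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        c = posterior(x2) - posterior(x)  # assumed sign convention
        pairs.append((x, x2, c))
    return pairs


def confdiff_risk(g, pairs, prior_pos, loss=logistic_loss):
    """Risk estimate that uses only pairs, confidence differences, and the prior.

    Under the independence assumption, each weighted term has the same
    expectation as the matching term of the ordinary classification risk,
    because the average posterior over the other pair member is the prior.
    """
    prior_neg = 1.0 - prior_pos
    total = 0.0
    for x, x2, c in pairs:
        total += 0.5 * (
            (prior_pos - c) * loss(g(x), +1)
            + (prior_neg - c) * loss(g(x2), -1)
            + (prior_pos + c) * loss(g(x2), +1)
            + (prior_neg + c) * loss(g(x), -1)
        )
    return total / len(pairs)


def supervised_risk(g, xs, loss=logistic_loss):
    """Reference risk using the posterior, which is normally unavailable."""
    total = 0.0
    for x in xs:
        p = posterior(x)
        total += p * loss(g(x), +1) + (1.0 - p) * loss(g(x), -1)
    return total / len(xs)


if __name__ == "__main__":
    rng = random.Random(0)
    pairs = make_confdiff_pairs(20000, rng)
    g = lambda x: 1.5 * x  # hypothetical fixed scorer, not a trained model
    xs = [p[0] for p in pairs] + [p[1] for p in pairs]
    # The posterior is symmetric around 0 and x ~ N(0, 1), so the prior is 0.5.
    print("confidence-difference estimate:", confdiff_risk(g, pairs, 0.5))
    print("posterior-based reference:     ", supervised_risk(g, xs))
```

In this sketch the weights such as `prior_pos - c` become negative whenever the confidence difference exceeds the class prior, so the empirical estimate is not bounded below by zero even though the loss is non-negative. This may be one reason a correction step helps against overfitting, though the link to the paper's risk correction approach is an interpretation rather than a statement from the abstract.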