Learning with Complementary Labels Revisited: The Selected-Completely-at-Random Setting Is More Practical
This work addresses complementary-label learning, a weakly supervised learning problem, in settings where neither the uniform distribution assumption nor an ordinary-label training set can be relied on.
The paper proposes a consistent approach to complementary-label learning that relies on neither the uniform distribution assumption nor an ordinary-label training set, and that outperforms state-of-the-art methods on synthetic and real-world benchmark datasets.
Complementary-label learning is a weakly supervised learning problem in which each training example is associated with one or multiple complementary labels indicating the classes to which it does not belong. Existing consistent approaches have relied on the uniform distribution assumption to model the generation of complementary labels, or on an ordinary-label training set to estimate the transition matrix in non-uniform cases. However, either condition may not be satisfied in real-world scenarios. In this paper, we propose a novel consistent approach that does not rely on these conditions. Inspired by the positive-unlabeled (PU) learning literature, we propose an unbiased risk estimator based on the Selected-Completely-at-Random assumption for complementary-label learning. We then introduce a risk-correction approach to address overfitting problems. Furthermore, we find that complementary-label learning can be expressed as a set of negative-unlabeled binary classification problems when using the one-versus-rest strategy. Extensive experimental results on both synthetic and real-world benchmark datasets validate the superiority of our proposed approach over state-of-the-art methods.
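To make the one-versus-rest view in the abstract concrete, the following is a minimal sketch of a negative-unlabeled risk estimate for a single class, summed over classes. It is not the paper's implementation. It assumes the standard rewriting borrowed from PU learning, in which the positive-class term is obtained from all training inputs (treated as unlabeled) minus the contribution of the examples carrying complementary label k (treated as negatives for class k). The function names (`nu_risk`, `ovr_risk`), the logistic loss, the known class priors, and the non-negative clamp used as the risk correction are all assumptions made for illustration.

```python
import math


def logistic_loss(margin):
    """Numerically stable log(1 + exp(-margin))."""
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def mean(values):
    return sum(values) / len(values)


def nu_risk(scores_unlabeled, scores_negative, class_prior, corrected=True):
    """Negative-unlabeled risk estimate for one one-versus-rest classifier.

    scores_unlabeled: f_k(x) on all training inputs (treated as unlabeled).
    scores_negative:  f_k(x) on inputs carrying complementary label k.
    class_prior:      pi_k = P(y = k), assumed known in this sketch.
    """
    neg_weight = 1.0 - class_prior
    # Negatives should receive low scores: loss of predicting "not k".
    negative_term = neg_weight * mean([logistic_loss(-s) for s in scores_negative])
    # pi_k * E[loss | y = k] rewritten as
    # E[loss over all x] - (1 - pi_k) * E[loss | y != k].
    positive_term = (
        mean([logistic_loss(s) for s in scores_unlabeled])
        - neg_weight * mean([logistic_loss(s) for s in scores_negative])
    )
    if corrected:
        # Assumed correction: clamp the term that can go negative when overfitting.
        positive_term = max(0.0, positive_term)
    return negative_term + positive_term


def ovr_risk(scores, comp_labels, class_priors):
    """Sum per-class NU risks; comp_labels[i] is a set of complementary labels."""
    total = 0.0
    for k, prior in enumerate(class_priors):
        unlabeled = [row[k] for row in scores]
        negative = [row[k] for row, cl in zip(scores, comp_labels) if k in cl]
        if negative:  # classes with no complementary examples are skipped here
            total += nu_risk(unlabeled, negative, prior)
    return total
```

For example, `ovr_risk([[2.0, -2.0], [-2.0, 2.0]], [{1}, {0}], [0.5, 0.5])` scores two examples on two classes, where the first example is known not to belong to class 1 and the second not to class 0. Setting `corrected=False` gives the plain estimate, which can become negative when a flexible model drives the loss on the negatives far above the loss on the unlabeled data.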