LG · Jul 27, 2022

Learning from Positive and Unlabeled Data with Augmented Classes

Zhongnian Li, Liutao Yang, Zhongchen Ma, Tongfeng Sun, Xinzheng Xu, Daoqiang Zhang

arXiv:2207.13274v11.8 · h-index: 13

Originality: Incremental advance

AI Analysis

The paper addresses a real-world challenge in PU learning, namely test-time class distributions that differ from those seen in training. The contribution appears incremental, since it extends existing PU methods to handle augmented classes.

The paper tackles the problem of Positive Unlabeled (PU) learning in open and changing scenarios where unobserved augmented classes emerge during testing, proposing an unbiased risk estimator for PU learning with Augmented Classes (PUAC) that shows effectiveness in experiments on multiple realistic datasets.

Positive Unlabeled (PU) learning aims to learn a binary classifier from only positive and unlabeled data, a setting that arises in many real-world scenarios. However, existing PU learning algorithms cannot deal with the real-world challenge of an open and changing scenario, where examples from unobserved augmented classes may emerge in the testing phase. In this paper, we propose an unbiased risk estimator for PU learning with Augmented Classes (PUAC) by utilizing unlabeled data from the augmented-class distribution, which can be easily collected in many real-world scenarios. In addition, we derive an estimation error bound for the proposed estimator, which provides a theoretical guarantee for its convergence to the optimal solution. Experiments on multiple realistic datasets demonstrate the effectiveness of the proposed approach.
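
For orientation, the sketch below computes the standard unbiased PU risk, the kind of estimator that PU methods generally start from. It is not the PUAC estimator proposed in the paper, whose form the abstract does not state. The function names, the choice of sigmoid loss, and the assumption that the positive class prior is known are all illustrative.

```python
import math
from typing import Callable, Sequence


def sigmoid_loss(margin: float) -> float:
    """Sigmoid loss l(z) = 1 / (1 + exp(z)); small when the margin z is large."""
    return 1.0 / (1.0 + math.exp(margin))


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def unbiased_pu_risk(
    scores_pos: Sequence[float],
    scores_unl: Sequence[float],
    class_prior: float,
    loss: Callable[[float], float] = sigmoid_loss,
) -> float:
    """Standard unbiased PU risk (hypothetical helper, not the paper's PUAC risk).

    R(f) = pi * E_p[l(f(x))] + E_u[l(-f(x))] - pi * E_p[l(-f(x))]

    scores_pos:  classifier outputs f(x) on labeled positive examples
    scores_unl:  classifier outputs f(x) on unlabeled examples
    class_prior: pi, the fraction of positives in the unlabeled data (assumed known)
    """
    pos_as_pos = _mean([loss(s) for s in scores_pos])    # positives scored as +1
    pos_as_neg = _mean([loss(-s) for s in scores_pos])   # positives scored as -1
    unl_as_neg = _mean([loss(-s) for s in scores_unl])   # unlabeled scored as -1
    # The negative-class risk is recovered by subtracting the positive
    # contribution from the unlabeled term.
    return class_prior * pos_as_pos + unl_as_neg - class_prior * pos_as_neg
```

With a classifier that outputs 0 everywhere, every loss term equals 0.5, so the estimate is 0.5 regardless of the class prior. A classifier that separates the classes well drives the estimate toward 0.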
