LG ML | Nov 12, 2018

Learning from positive and unlabeled data: a survey

arXiv:1811.04820v3 | 693 citations

Originality: Synthesis-oriented

AI Analysis

As a survey, the paper makes an incremental contribution: it summarizes existing work for machine learning researchers and practitioners.

This survey addresses the problem of learning from positive and unlabeled data, where only positive examples and unlabeled data are available, by reviewing the current state of the art and proposing key research questions in the field.

Learning from positive and unlabeled data or PU learning is the setting where a learner only has access to positive examples and unlabeled data. The assumption is that the unlabeled data can contain both positive and negative examples. This setting has attracted increasing interest within the machine learning literature as this type of data naturally arises in applications such as medical diagnosis and knowledge base completion. This article provides a survey of the current state of the art in PU learning. It proposes seven key research questions that commonly arise in this field and provides a broad overview of how the field has tried to address them.
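The setting described above can be illustrated with a minimal sketch. It is not taken from the survey: the function name `make_pu_labels` and the parameter `label_frequency` are hypothetical, and the sketch assumes, purely for illustration, that each positive example is labeled independently with a fixed probability. It shows that the learner sees only a "labeled" indicator, while the unlabeled set can hide both positives and negatives.

```python
import random


def make_pu_labels(y_true, label_frequency, seed=0):
    """Turn fully labeled binary data into a positive-unlabeled (PU) labeling.

    y_true: list of 0/1 ground-truth classes (hidden from the learner).
    label_frequency: assumed probability that a positive example is labeled
        (a hypothetical parameter chosen for this sketch).
    Returns a list s where s[i] == 1 means "labeled positive" and
    s[i] == 0 means "unlabeled" (the true class may be 0 or 1).
    """
    rng = random.Random(seed)
    return [
        1 if y == 1 and rng.random() < label_frequency else 0
        for y in y_true
    ]


# Ground truth that a PU learner never observes directly.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]

# Observed labels: only some positives are marked, negatives never are.
s = make_pu_labels(y_true, label_frequency=0.5)

labeled = [i for i, si in enumerate(s) if si == 1]
unlabeled = [i for i, si in enumerate(s) if si == 0]
```

Every index in `labeled` is a true positive, whereas `unlabeled` mixes all the negatives with whichever positives were not selected for labeling.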
