A Review of Open-World Learning and Steps Toward Open-World Learning Without Labels
This paper addresses how AI agents can learn new classes from unlabeled data in an open-world setting, an incremental step for the field of machine learning.
This paper reviews open-world learning, a paradigm where an agent learns new classes from a non-stationary data stream, and formalizes various open-world learning problems, including one without labels. It proposes a framework combining modules for novelty detection, characterization, incremental learning, and instance management to learn new classes from unlabeled data in an unsupervised manner, defining seven baselines for this problem.
In open-world learning, an agent starts with a set of known classes, detects and manages things that it does not know, and learns them over time from a non-stationary stream of data. Open-world learning is related to, but distinct from, a multitude of other learning problems, and this paper briefly analyzes the key differences among a wide range of them, including incremental learning, generalized novelty discovery, and generalized zero-shot learning. It formalizes various open-world learning problems, including open-world learning without labels. These open-world problems can be addressed with modifications to known elements. We present a new framework that enables agents to combine various modules for novelty detection, novelty characterization, incremental learning, and instance management to learn new classes from a stream of unlabeled data in an unsupervised manner. We survey how to adapt a few state-of-the-art techniques to fit the framework and use them to define seven baselines for performance on the open-world learning without labels problem. We then discuss open-world learning quality and analyze how it can improve instance management. We also discuss some of the general ambiguity issues that occur in open-world learning without labels.
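To make the four-module structure concrete, the following is a minimal sketch of how such modules might be wired together over an unlabeled stream. It is an illustration under stated assumptions, not the framework or any of the seven baselines from the paper: the class name `OpenWorldAgent`, its parameters, the nearest-class-mean detector, and the greedy leader clustering are all hypothetical stand-ins chosen for brevity.

```python
import math


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


class OpenWorldAgent:
    """Hypothetical toy agent; every name and rule here is an assumption."""

    def __init__(self, known_means, novelty_threshold, min_cluster_size):
        self.class_means = dict(known_means)  # label -> feature mean
        self.threshold = novelty_threshold    # distance beyond which x is novel
        self.min_cluster_size = min_cluster_size
        self.buffer = []                      # unknown instances held for later
        self.next_id = 0

    def detect(self, x):
        """Novelty detection: nearest class mean plus a distance threshold."""
        label = min(self.class_means,
                    key=lambda c: math.dist(x, self.class_means[c]))
        return label, math.dist(x, self.class_means[label]) > self.threshold

    def characterize(self):
        """Novelty characterization: greedy leader clustering of the buffer."""
        clusters = []
        for x in self.buffer:
            for cluster in clusters:
                if math.dist(x, cluster[0]) <= self.threshold:
                    cluster.append(x)
                    break
            else:
                clusters.append([x])
        return clusters

    def learn(self, clusters):
        """Incremental learning and instance management: promote clusters
        that are large enough to new classes, keep the rest buffered."""
        self.buffer = []
        for cluster in clusters:
            if len(cluster) >= self.min_cluster_size:
                self.class_means[f"discovered_{self.next_id}"] = mean(cluster)
                self.next_id += 1
            else:
                self.buffer.extend(cluster)

    def step(self, x):
        """Process one unlabeled instance and return a label or 'unknown'."""
        label, is_novel = self.detect(x)
        if not is_novel:
            return label
        self.buffer.append(x)
        self.learn(self.characterize())
        label, is_novel = self.detect(x)
        return "unknown" if is_novel else label


if __name__ == "__main__":
    agent = OpenWorldAgent({"a": (0.0, 0.0), "b": (10.0, 0.0)},
                           novelty_threshold=2.0, min_cluster_size=3)
    for x in [(0.5, 0.0), (5.0, 5.0), (5.2, 5.0), (4.8, 5.1)]:
        print(x, agent.step(x))
```

In this toy stream the agent never sees a label: the first point falls within the threshold of a known class, and the remaining points are buffered as unknown until enough of them cluster together to be promoted to a new class. A realistic system would replace each method with a stronger technique and would need to handle the ambiguity of deciding when a cluster really is a new class.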