Comparative Analysis of Extreme Verification Latency Learning Algorithms
This work reviews existing methods for researchers who learn from non-stationary data streams that become entirely unlabeled after an initial labeled set. Its contribution is incremental: it consolidates the understanding of existing methods.
This paper surveys and compares existing algorithms for Extreme Verification Latency (EVL), a challenging scenario where learning occurs from non-stationary, initially labeled data streams that subsequently become entirely unlabeled. The analysis evaluates algorithms based on classification accuracy, computational complexity, and parameter sensitivity across various synthetic and real-world datasets.
One of the more challenging real-world problems in computational intelligence is learning from non-stationary streaming data, a setting also known as concept drift. Perhaps an even more challenging version of this scenario arises when, following a small set of initial labeled data, the data stream consists of unlabeled data only. Such a scenario is typically referred to as learning in initially labeled nonstationary environments, or simply as extreme verification latency (EVL). Because the problem is so challenging, very few algorithms have been proposed in the literature to date. This work is a first effort to provide the research community with a review of some of the prominent existing algorithms in this field. More specifically, this paper is a comprehensive survey and comparative analysis of some of the EVL algorithms, pointing out the strengths and weaknesses of different approaches from three perspectives: classification accuracy, computational complexity, and parameter sensitivity, using several synthetic and real-world datasets.
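To make the EVL setting concrete, the following is a minimal sketch of a stream in which only the first batch is labeled and the class distributions then drift. It is not one of the surveyed algorithms: the self-training nearest-centroid learner, the one-dimensional Gaussian classes, and all names and parameter values (`make_batch`, `run_stream`, the drift of 0.2 per step) are assumptions chosen purely for illustration. True labels are used only to score predictions, never to update the model.

```python
import random


def make_batch(rng, step, n_per_class=50, drift=0.2, std=0.5):
    """Draw one batch; both class means shift by `drift` per step (assumed drift model)."""
    batch = []
    for label, base in ((0, 0.0), (1, 4.0)):
        mean = base + drift * step
        batch += [(rng.gauss(mean, std), label) for _ in range(n_per_class)]
    return batch


def centroids_from(points):
    """Compute the mean of each class present in a list of (x, label) pairs."""
    out = {}
    for label in {lab for _, lab in points}:
        values = [x for x, lab in points if lab == label]
        out[label] = sum(values) / len(values)
    return out


def nearest(x, centroids):
    """Predict the class whose centroid is closest to x."""
    return min(centroids, key=lambda c: abs(x - centroids[c]))


def run_stream(adapt, steps=40, seed=0):
    """Return accuracy over the unlabeled part of the stream.

    adapt=False: classifier frozen after the initial labeled batch.
    adapt=True:  centroids re-estimated from the model's own predictions.
    """
    rng = random.Random(seed)
    # Step 0 is the only batch whose labels the learner may use.
    centroids = centroids_from(make_batch(rng, 0))
    correct = total = 0
    for step in range(1, steps + 1):
        batch = make_batch(rng, step)
        preds = [nearest(x, centroids) for x, _ in batch]
        # True labels are consulted here for evaluation only.
        correct += sum(p == y for p, (_, y) in zip(preds, batch))
        total += len(batch)
        if adapt:
            pseudo = [(x, p) for (x, _), p in zip(batch, preds)]
            # Classes with no predicted points keep their previous centroid.
            centroids.update(centroids_from(pseudo))
    return correct / total


static_acc = run_stream(adapt=False)
adaptive_acc = run_stream(adapt=True)
print(f"static: {static_acc:.3f}  self-training: {adaptive_acc:.3f}")
```

Under these assumptions the frozen classifier degrades as the classes move away from the initial decision boundary, while the self-training variant tracks the drift. This works here only because the drift per step is small relative to the class separation; with abrupt drift or overlapping classes, pseudo-label errors would accumulate.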