LG · Feb 8, 2023

Combining self-labeling and demand based active learning for non-stationary data streams

arXiv:2302.04141v1 · h-index: 12
Originality: Synthesis-oriented
AI Analysis

This paper addresses the challenge of learning from data streams with limited labels in domains such as social media and industrial monitoring, but it appears incremental because it builds on existing active learning and self-labeling methods.

The paper tackles the problem of learning from scarcely labeled, non-stationary data streams by proposing a novel online k-nn classifier that combines self-labeling and demand-based active learning. The abstract provides no concrete results or numbers.

Learning from non-stationary data streams is a research direction that is gaining increasing interest as more data in the form of streams becomes available, for example from social media, smartphones, or industrial process monitoring. Most approaches assume that the ground truth of the samples becomes available (possibly with some delay) and perform supervised online learning in the test-then-train scheme. While this assumption might be valid in some scenarios, it does not apply to all settings. In this work, we focus on scarcely labeled data streams and explore the potential of self-labeling in gradually drifting data streams. We formalize this setup and propose a novel online $k$-nn classifier that combines self-labeling and demand-based active learning.
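The abstract gives no algorithmic detail, so the following is only a minimal sketch of how an online $k$-nn classifier could combine self-labeling with label requests on demand in a test-then-train loop. It is not the authors' method. The class name `SelfLabelingKNN`, the sliding window, the agreement threshold `confidence`, and the `oracle` callback are all illustrative assumptions.

```python
from collections import Counter, deque
import math


class SelfLabelingKNN:
    """Hypothetical sliding-window k-nn for scarcely labeled streams.

    Confident predictions are stored as self-assigned labels; a true
    label is requested from an oracle only when the vote is too weak.
    Every name and threshold here is an assumption, not the paper's design.
    """

    def __init__(self, k=3, window=200, confidence=0.8):
        self.k = k
        self.confidence = confidence        # assumed agreement threshold
        self.memory = deque(maxlen=window)  # oldest samples drop out (assumed way to follow drift)
        self.queries = 0                    # number of labels requested so far

    def predict(self, x):
        """Majority vote of the nearest stored samples: (label, agreement)."""
        if not self.memory:
            return None, 0.0
        nearest = sorted(self.memory, key=lambda s: math.dist(s[0], x))[: self.k]
        label, votes = Counter(y for _, y in nearest).most_common(1)[0]
        # Dividing by k (not by the number of neighbours found) keeps
        # agreement low until at least k samples have been stored.
        return label, votes / self.k

    def step(self, x, oracle):
        """Test-then-train: predict first, then store x with some label."""
        label, agreement = self.predict(x)
        if label is None or agreement < self.confidence:
            y = oracle(x)       # label requested on demand
            self.queries += 1
        else:
            y = label           # self-labeling: trust the prediction
        self.memory.append((x, y))
        return label
```

In this sketch the bounded `deque` is what lets the model follow gradual drift: self-labeled samples near the current class regions replace older ones as the stream moves. The oracle is only consulted when the neighbourhood vote falls below `confidence`, so the number of requested labels can be read from `queries`.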

