LG · Mar 11

Fingerprinting Concepts in Data Streams with Supervised and Unsupervised Meta-Information

arXiv:2603.11094v1 · 4.59 citations · h-index: 7
Predicted impact: top 83% in LG · last 90 days · Originality: Highly original
AI Analysis

This paper addresses concept drift in streaming data systems. Its approach to improving adaptability is novel, though it builds on existing concept-representation ideas.

The paper tackles the problem of concept drift in data streams by proposing FiCSUM, a framework that uses fingerprints with many meta-information features to uniquely identify concepts, outperforming state-of-the-art methods in accuracy and drift modeling across 11 datasets.

Streaming sources of data are becoming more common as the ability to collect data in real-time grows. A major concern in dealing with data streams is concept drift, a change in the distribution of data over time, for example, due to changes in environmental conditions. Representing concepts (stationary periods featuring similar behaviour) is a key idea in adapting to concept drift. By testing the similarity of a concept representation to a window of observations, we can detect concept drift to a new or previously seen recurring concept. Concept representations are constructed using meta-information features, values describing aspects of concept behaviour. We find that previously proposed concept representations rely on small numbers of meta-information features. These representations often cannot distinguish concepts, leaving systems vulnerable to concept drift. We propose FiCSUM, a general framework to represent both supervised and unsupervised behaviours of a concept in a fingerprint, a vector of many distinct meta-information features able to uniquely identify more concepts. Our dynamic weighting strategy learns which meta-information features describe concept drift in a given dataset, allowing a diverse set of meta-information features to be used at once. FiCSUM outperforms state-of-the-art methods over a range of 11 real world and synthetic datasets in both accuracy and modeling underlying concept drift.
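The core idea in the abstract, comparing a stored concept fingerprint against a fingerprint of the current window, can be illustrated with a minimal sketch. This is not the authors' implementation: the four meta-information features, the weighted cosine similarity, the fixed weights, the threshold, and the names `fingerprint`, `weighted_cosine`, `closest_concept`, and `make_window` are all assumptions chosen for illustration. In particular, the constant weights stand in for the dynamic weighting strategy, which the paper learns per dataset.

```python
import math
from statistics import fmean, pstdev


def fingerprint(xs, errors):
    """Summarise a window as a vector of meta-information features.

    xs: input values seen in the window (unsupervised behaviour).
    errors: 0/1 classifier mistakes in the window (supervised behaviour).
    The feature set is an illustrative assumption, not the paper's.
    """
    mean = fmean(xs)
    std = pstdev(xs)
    if std == 0.0:
        autocorr = 0.0  # lag-1 autocorrelation is undefined for a constant window
    else:
        cov = sum((a - mean) * (b - mean) for a, b in zip(xs, xs[1:])) / len(xs)
        autocorr = cov / (std * std)
    return [mean, std, autocorr, fmean(errors)]


def weighted_cosine(u, v, weights):
    """Cosine similarity with a non-negative weight per feature."""
    dot = sum(w * a * b for w, a, b in zip(weights, u, v))
    norm_u = math.sqrt(sum(w * a * a for w, a in zip(weights, u)))
    norm_v = math.sqrt(sum(w * b * b for w, b in zip(weights, v)))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def closest_concept(window_fp, stored, weights, threshold):
    """Return the id of the most similar stored concept, or None when every
    similarity is below the threshold (treated here as a new concept)."""
    best_id, best_sim = None, threshold
    for concept_id, fp in stored.items():
        sim = weighted_cosine(window_fp, fp, weights)
        if sim >= best_sim:
            best_id, best_sim = concept_id, sim
    return best_id


def make_window(offset, error_every, n=60):
    """Deterministic toy data: a repeating pattern around `offset`,
    with one classifier error every `error_every` observations."""
    xs = [offset + ((7 * i) % 5 - 2) * 0.5 for i in range(n)]
    errors = [1 if i % error_every == 0 else 0 for i in range(n)]
    return xs, errors


# Fingerprints of two previously seen concepts.
stored = {
    "A": fingerprint(*make_window(offset=0.0, error_every=10)),
    "B": fingerprint(*make_window(offset=5.0, error_every=3)),
}
weights = [1.0, 1.0, 1.0, 1.0]  # fixed placeholder for learned weights

# A window that resembles concept B, and one unlike either stored concept.
recurring = fingerprint(*make_window(offset=5.2, error_every=3, n=40))
novel = fingerprint(*make_window(offset=-5.0, error_every=2))

match = closest_concept(recurring, stored, weights, threshold=0.9)
no_match = closest_concept(novel, stored, weights, threshold=0.9)
print(match, no_match)
```

With these toy windows, the window drawn near concept B is matched to B, while the shifted window matches nothing and would be stored as a new concept. Unweighted cosine similarity lets the largest-magnitude feature dominate, which is one reason a practical system would scale and weight features rather than use them raw.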

