LG | ML | Dec 1, 2020

Analysis of Drifting Features

arXiv:2012.00499v1 | 3 citations
AI Analysis

This work provides a method for researchers and practitioners in data stream mining to pinpoint the specific features responsible for concept drift, aiding in more targeted model adaptation.

This paper addresses the problem of identifying the features most relevant to concept drift in data streams. It distinguishes drift-inducing features (whose drift cannot be explained by any other feature) from faithfully drifting features (whose drift correlates with the drift of other features), which leads to minimal feature subsets that characterize the entire drift.

Concept drift refers to the phenomenon that the distribution underlying the observed data changes over time. We are interested in identifying those features that are most relevant for the observed drift. We distinguish between drift-inducing features, for which the observed feature drift cannot be explained by any other feature, and faithfully drifting features, which correlate with the present drift of other features. This notion gives rise to minimal subsets of the feature space that are able to characterize the observed drift as a whole. We relate this problem to the problems of feature selection and feature relevance learning, which allows us to derive a detection algorithm. We demonstrate its usefulness on different benchmarks.
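The distinction between drift-inducing and faithfully drifting features can be illustrated with a small toy sketch. This is not the authors' detection algorithm. It assumes that drift can be read as statistical dependence between a feature and a time label, and it substitutes linear (partial) correlation with a fixed cutoff for a proper relevance test. The function names, the two-window time label, the synthetic data, and the `threshold` value are all hypothetical choices made for illustration.

```python
import math
import random


def pearson(a, b):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(a, b, given):
    """Partial correlation of a and b given a list of columns.

    Uses the standard recursive formula, removing one conditioning
    column at a time. Captures linear dependence only.
    """
    if not given:
        return pearson(a, b)
    z, rest = given[0], given[1:]
    r_ab = partial_corr(a, b, rest)
    r_az = partial_corr(a, z, rest)
    r_bz = partial_corr(b, z, rest)
    return (r_ab - r_az * r_bz) / math.sqrt((1 - r_az ** 2) * (1 - r_bz ** 2))


def classify_features(columns, window, threshold=0.1):
    """Label each feature column by how it relates to the time label.

    Assumption (ours, for illustration): a feature is treated as
    drift-inducing if it still depends on time after conditioning on all
    other features, as faithfully drifting if it depends on time only
    marginally, and as non-drifting otherwise.
    """
    labels = []
    for i, col in enumerate(columns):
        others = [c for j, c in enumerate(columns) if j != i]
        marginal = abs(pearson(col, window))
        conditional = abs(partial_corr(col, window, others))
        if conditional >= threshold:
            labels.append("drift-inducing")
        elif marginal >= threshold:
            labels.append("faithfully drifting")
        else:
            labels.append("non-drifting")
    return labels


# Synthetic stream: window 0 is "before", window 1 is "after" the drift.
random.seed(0)
n = 4000
window = [0.0] * (n // 2) + [1.0] * (n // 2)
x0 = [2.0 * w + random.gauss(0, 1) for w in window]  # mean shifts over time
x1 = [v + random.gauss(0, 0.5) for v in x0]          # noisy copy of x0
x2 = [random.gauss(0, 1) for _ in window]            # unaffected by time

labels = classify_features([x0, x1, x2], window)
print(labels)
```

In this toy setup, `x0` carries the drift itself, `x1` drifts only because it follows `x0`, and `x2` does not drift, so the subset containing only `x0` is enough to characterize the change. A fixed correlation cutoff is a crude stand-in; a practical detector would use a statistical test or a feature relevance method able to capture nonlinear dependence.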

Code Implementations: 1 repo