LG
Dec 2, 2022

On the Change of Decision Boundaries and Loss in Learning with Concept Drift

arXiv:2212.01223v1
h-index: 12
Originality: Synthesis-oriented
AI Analysis

This work addresses the reliability of drift detection methods for machine learning practitioners, but it is incremental as it builds on existing ITTE-based approaches.

The paper investigates the mathematical justification of using interleaved test-train error (ITTE) for detecting concept drift, relating ITTE changes to real drift and optimal training results, supported by empirical evidence across algorithms and datasets.

The notion of concept drift refers to the phenomenon that the distribution generating the observed data changes over time. If drift is present, machine learning models may become inaccurate and need adjustment. Many technologies for learning with drift rely on the interleaved test-train error (ITTE) as a quantity which approximates the model generalization error and triggers drift detection and model updates. In this work, we investigate to what extent this procedure is mathematically justified. More precisely, we relate a change of the ITTE to the presence of real drift, i.e., a changed posterior, and to a change of the training result under the assumption of optimality. We support our theoretical findings by empirical evidence for several learning algorithms, models, and datasets.
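
To make the ITTE concrete, the following is a minimal sketch of the test-then-train loop it is commonly computed with: each incoming sample is first used to score the current model and only afterwards to update it. The toy nearest-mean learner, the synthetic one-dimensional stream with a label flip at the midpoint (a simple stand-in for real drift, i.e., a changed posterior), and all names such as `OnlineCentroidClassifier` and `make_stream` are illustrative assumptions and are not taken from the paper.

```python
import random


class OnlineCentroidClassifier:
    """Toy incremental learner (assumption): predicts the class with the nearest running mean."""

    def __init__(self):
        self.sums = {0: 0.0, 1: 0.0}
        self.counts = {0: 0, 1: 0}

    def predict(self, x):
        # Until both classes have been observed, fall back to class 0.
        if min(self.counts.values()) == 0:
            return 0
        means = {c: self.sums[c] / self.counts[c] for c in (0, 1)}
        return min(means, key=lambda c: abs(x - means[c]))

    def learn(self, x, y):
        self.sums[y] += x
        self.counts[y] += 1


def interleaved_test_train_errors(stream, model):
    """Test on each sample first, then train on it; return the per-sample 0/1 losses."""
    losses = []
    for x, y in stream:
        losses.append(int(model.predict(x) != y))  # test step
        model.learn(x, y)                          # train step
    return losses


def make_stream(n, flip_labels, rng):
    """Hypothetical 1-D stream: class 0 near -1, class 1 near +1.

    With flip_labels=True the labels are swapped, so p(y | x) changes.
    """
    samples = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = rng.gauss(1.0 if y == 1 else -1.0, 0.3)
        samples.append((x, 1 - y if flip_labels else y))
    return samples


rng = random.Random(0)
stream = make_stream(500, False, rng) + make_stream(500, True, rng)
losses = interleaved_test_train_errors(stream, OnlineCentroidClassifier())


def window_error(start, length=100):
    """Mean ITTE over a fixed window of the loss sequence."""
    return sum(losses[start:start + length]) / length


before = window_error(400)  # last 100 samples before the label flip
after = window_error(500)   # first 100 samples after the label flip
print("windowed ITTE before drift:", before)
print("windowed ITTE after drift:", after)
```

In this toy setting the windowed error jumps once the labels flip, which is the kind of signal an ITTE-based detector would react to. Whether such a jump reliably indicates real drift in general is the question the abstract raises; the sketch only illustrates how the quantity is computed.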

